Chapter 1


Introduction 


I am the people — the mob — the crowd — the mass 
Do you know that all the great work of the world is done through me? 

— Carl Sandburg, in I Am the People, the Mob (1916) 

Power is the great aphrodisiac. 

— Henry Kissinger, in The New York Times (January 19, 1971)


Concurrent processing is becoming an increasingly popular field in computer
science. The vision of harnessing previously undreamt-of computational power at a
reasonable cost is leading the drive. By connecting many moderately powerful
microprocessors through a communications medium, system designers hope to take advantage
of the collective power of the architecture to solve tasks that were previously time- or
cost-prohibitive.

Unfortunately, the eager concurrent system designer soon finds that many issues
are still unresolved. Though people have a fairly good grasp of ways to build successful
sequential machines, it is less clear how to build optimal, or even acceptable, concurrent
systems. The designer soon faces a barrage of questions that are difficult to answer:
“What grain of parallelism should be supported?” “What level of functionality should the
processors provide?” “How should the processors communicate?” “How tightly coupled
should the processors be?” “How should memory be managed?” “How should the load be
distributed?” Many research groups are attempting to answer these questions at this very
moment.

Some insight into concurrent architectures has been gained over the years, and
the current directions of research reflect the knowledge gained. Multicomputer networks
(sometimes called “ensemble machines”) are one direction that concurrent systems research
has taken. This genre of machine connects relatively conventional microprocessors via an
automatically routed network. The design is attractive because it builds on well
understood sequential processor technology for the processing nodes, and the performance of
the system can grow in proportion to the number of processors¹, providing scalability.

For the past two years, the Concurrent VLSI Architecture Group at M.I.T. has been
designing a concurrent processing network, christened the Jellybean Machine, under the
direction of Professor William Dally [Dal86c]. The goal of the Jellybean Machine project is
to design a scalable concurrent processor out of low-priced (jellybean) parts that efficiently
supports an object-oriented execution model. The processor is targeted at both symbolic
and numeric applications, and will be programmed in high-level, object-oriented languages.
We hope it will serve as a successful example and a test bed for advanced concurrent systems
research.


¹ At least up to some point.


1.1 Scope of Thesis

This thesis report describes the design and implementation of an operating system prototype
for the J-Machine. The operating system was required to support a global namespace across
the distributed processors, allocate memory in an object-based storage model, support
inter-processor communication, and provide system services to control code execution, object
migration, and an object-oriented calling model. It also provided a perch from which more
advanced issues in system design could be studied.


1.2 Highlights of Contributions 


In the course of the design of the J-Machine operating system, several ideas were developed 
that may be of special interest to the designer of multicomputer networks. 

• In section 3.4, I describe a virtual addressing system that resolves object names
across distributed nodes by a mechanism known as hometown addressing. This scheme
delegates to object birthnodes the responsibility for knowing current object residences,
permitting object migration. An accompanying mechanism of “hints” is provided to
improve performance.

• To simplify the hardware with minimal cost in flexibility, we have developed an
explicit, one-time virtual translation scheme via the XLATE machine instruction, which
converts a virtual address to a physical one. Retranslation is provided automatically
by fault handlers.

• Chapter 5 describes a low-overhead code execution model that supports inexpensive
remote procedure calls, local caching of code, and convenient suspension and resumption
of processes.

• Section 5.4 describes a system for fast context creation that involves the re-use of old
context objects. This is an important optimization given the short life and high
frequency of context allocation.

• Section 5.6 outlines a simple and fast resource distribution mechanism that limits
bottlenecks and cross-network traffic by dynamically creating a type distribution tree
for the resource.


1.3 A Closer Look At The Jellybean Machine



The J-Machine is composed of many custom RISC microprocessors called Message-Driven
Processors, or MDPs. These processing elements have small, local memories and are
connected in a loosely coupled network. Inter-node communication is provided via message
sends that are automatically routed to the proper destination nodes. A virtual,
object-based memory abstraction is built over the distributed nodes, providing a uniform
global namespace. Various levels of low-cost execution control provide a reasonably fine
grain of concurrency (on the level of 30-instruction procedures). An object-oriented
execution model is built upon this fine-grain execution model. The rest of the system
implements miscellaneous system services and mechanisms to improve performance.


1.4 Background

Concurrent architecture design has been seriously studied for at least the past fifteen years,
but there is still much to be learned. The various visions of machines, operating systems,
and target applications are so diverse that few definitive statements can be made.

We see SIMD parallelism promoted by vector operations, as in the Cray. More
complicated architectures like the Connection Machine [Hil85], and systolic array processors
like the Warp [Kun82], are alternative approaches, providing fine-grain concurrency with
repetitive processing while permitting reconfiguration. MIMD architectures are just as
diverse. There are extremely fine-grain dataflow machines like the Manchester Machine,
Sigma-1, and the MIT Tagged-Token Dataflow Machine [Aea80]; bus-based shared memory
architectures like the IBM RP3, Inmos Transputer, and C.mmp [WLH81]; multicomputer
networks like the Cosmic Cube [Sei85] and Cm* [OSS80]; and distributed systems like System
R* [Lin80].

The Jellybean Machine, while borrowing ideas from successful research endeavors, 
has goals unique enough to gain a somewhat different character from other machines of 
its genre. It communicates via message passing and addresses only local memory, as in 
the Cosmic Cube [Sei85] and the Medusa system [OSS80]. On the other hand, these two 
systems control execution by a system of pipes and locks, where processes wait for data to 
arrive via messages. The J-Machine, instead, uses message sends to schedule processes, and 
not to provide socket-to-socket communication. State manipulation doesn’t involve explicit 
connections between running processes. Instead, return values are propagated around to 
slots in contexts and code is executed when results arrive in a more “functional” manner. 

Many systems also have virtual memory, and some systems use an object- or
segment-based storage model [WLH81], as does the J-Machine, but the emphasis is slightly
different in our design. Where most systems use a virtually addressed, multi-level memory
system to expand primary memory and provide relative address mapping, the J-Machine uses a
virtual addressing system to provide a global namespace across all nodes and to provide
convenient access to objects as the primitive memory metric. This is more similar to large,
complex distributed systems such as IBM’s distributed database, System R* [Lin80], than to
conventional parallel processors.

Finally, the J-Machine targets itself to a high-level programming environment. The
RISC processing node, called the Message-Driven Processor [HT88], provides a fast,
powerful substrate for the execution of high-level languages, such as Smalltalk. There are
several architectures designed for the efficient execution of high-level language
applications, such as the Symbolics Lisp Machine and the SOAR Smalltalk processor [Ung87],
but very little work has been done targeting concurrent processors to high-level languages.

1.5 Organization 

The rest of this report discusses the structure of the Jellybean system. Chapter 2 provides
a high-level layering of the Jellybean system, from single processing node hardware to the
high-level programming of the entire concurrent processing network. Chapter 3 describes
the memory management and addressing system. Chapter 4 discusses the machine as a
distributed system supporting object migration to balance load. Chapter 5 explains code
execution on the method level, and chapter 6 details the object-oriented calling extensions.
Storage reclamation issues are introduced in chapter 7. Chapter 8 discusses some of the
services provided to support high-level language constructs and to control code execution.
Chapter 9 describes the prototype operating system implementation, noting its successful as
well as not-so-successful features, and discussing some of the difficulties and quirks faced
by the system designer. The report concludes with a performance evaluation and summary in
chapters 10 and 11.



Chapter 2 


The Execution Model of the 
Jellybean Machine 


These unhappy times call for the building of plans ... 
that build from the bottom up and not from the top down 

— Franklin Delano Roosevelt, in his April 17, 1932 Radio Address 

The Jellybean Operating System Software (JOSS) is built in a layered manner where 
each layer provides a different model of functionality to the machine. Figure 2.1 attempts to 
describe this layering, and what new functionality each layer provides to the entire system. 

At the bottom of the figure lies the base processor and boot code. At this stage,
the processing node can be initialized, and can run independently as a limited
microprocessor. The addition of system call and fault handlers provides a level of system
services and robustness to the microprocessor, allowing it to allocate memory in an
object-based, virtually addressed manner, and to handle various types of exceptional
conditions at run time. These first two levels of the Jellybean system build up the abstract
processing node capable of executing machine code and performing a set of system services.


Execution Model Layer             Functionality
--------------------------------  ----------------------------------------------
High Level Languages              User programming language
Intermediate Code                 Simple machine independent target language
SEND Message Handler              Class/Selector calling model
CALL Message Handler              Remote Method Calls
Primitive Message Support         Communication
                                  Distributed Namespace
                                  Concurrent computing
System Calls and Fault Handlers   Object-based memory allocation
                                  Optimistic code generation
                                  Virtual Namespace
                                  Assorted System Services
Machine Code                      Simple instruction set, tagged, local memory
                                  Fast priority switches

Figure 2.1: Layering of Jellybean System


Concurrency is provided as the next level of functionality by the introduction of
primitive message handlers. Each processing node has the ability to send messages to any
other node, where a message is simply a physical address at which to start running on a
foreign node, followed by routine-specific data. Thus, a Jellybean primitive message is
actually just a way of changing the program counter of a remote node. A set of common
operations can be placed in identical physical memory locations on each node, so that an
operation can be run on any node by mailing that routine’s address to the node. The
operating system provides a small set of primitive message handlers to perform common
operations, which reside in the same locations on each node. With this small set of
locked-down routines, the machine gains the ability to compute concurrently, to use a global
addressing abstraction over the physically distributed memories, and to perform some amount
of object migration and other control of resources.

Two primitive message handlers are special, in that other system services are
built on top of them. The CALL message handler provides a mechanism for starting code
contained in virtually addressed, relocatable objects, rather than just code that resides at
locked-down physical addresses. This provides a convenient way of packaging objects and
supporting remote procedure calls. The SEND message handler takes the code execution
mechanism to an even higher level, and provides for a dispatch-on-type calling model as used
in object-oriented systems like Flavors or Smalltalk.

The final two layers of the system are the interfaces for the programming models.
Under this highest level of abstraction, the Jellybean Machine appears to the user as a
system for running high-level languages like Smalltalk.

The rest of this chapter will go into the abstractions in more detail, describing what 
functionality each level of the machine provides. It may be helpful to refer back to figure 
2.1 as you read the following sections. 

Each node of the Jellybean multiprocessor (a Message-Driven Processor) is a
tagged-architecture microprocessor with a small on-chip memory and separate register sets
for operating at two priority levels.

2.1.1 Machine Code 

The machine code interpreted by a Message-Driven Processor (MDP) is a simple 3-operand
instruction set [HT88]. Code is executed sequentially, and changes in control are provided
by simple conditional and unconditional branches. The instruction stream is accessed via
two registers, one that points at the base of the code block (A0), and one that indicates
the current offset into this block (IP).

2.1.2 System Calls 

The processor also has a small fixed-length stack, and a mechanism to make system calls.
This provides us with the ability to transfer control to common subroutines, and easily
restore execution upon return. The addition of the system call machinery gives us the
ability to provide several extensions to the processor in terms of system services written
in machine code. Heap management and an object-based memory allocation model are provided
with system calls, as are the mechanisms to address these objects with relocatable, virtual
IDs.

2.1.3 Fault Handlers 

Similar to system calls, the MDP also contains a fault handler table providing software
routines to run when instructions fault because of various exception conditions (tag
mismatches, addressing past a segment, integer overflow, translation buffer lookup miss,
etc.). When a fault occurs, the IP is pushed onto the stack, and the appropriate fault
routine (found in the exception vector table) is run. The address of each fault handler is
placed in the exception vector table by software initialization. The addition of the fault
handlers gives us several advantages in our quest for an object-oriented concurrent
processor. We can use tag checking to support optimistic code generation and a type of
“generic operation” approach on the machine code level. The fault handlers also give us the
ability to efficiently implement virtual ID lookup via the XLATE instruction. The fault
handlers will be described in more detail later when the entire system has been more
thoroughly explained.

Since both the system calls and fault handlers are supported by a software initialized 
vector table, the processor can be “reshaped” into a different type of machine by replacing 
the ROM code that sets up this table. Only the instruction set is fixed, allowing the MDP 
processing node to be used as a basis for various alternative concurrent processing system 
paradigms. 

2.1.4 The Basic Node of Computation 

With what we have described so far, our processor is a sequential machine, able to
execute at one of two priority levels. It refers to its instruction stream using physical
memory base and offset registers. The addition of the system calls provides an interface to
OS services, such as those to allocate memory, generate virtual object IDs, and manage
object ID to physical address translation. The fault handlers permit us to develop
“optimistic” code, where a normal, error-free execution proceeds rapidly, and we only pay
the price of software execution if an error condition occurs. The fault handlers are also
used to support a fast virtual namespace, where translation can be as fast as the XLATE
instruction.

The sum is a flexible, object-based microprocessor that will serve as our basic node 
of computation as we venture into the realm of concurrency. 


2.2 The Concurrent Processor Model

By providing mechanisms for node-to-node communication, our machine becomes a
multiprocessor, called the Jellybean Machine. Many MDP processing nodes (as well as other
potential nodes such as floating point processors and memory nodes) are connected together
in a network. Communication between the nodes is provided by the MDP SEND instruction,
which injects messages into the network. The messages are routed by routing hardware to
the message queues on the destination node.

Each message received by an MDP processing node consists of two parts: a message
header, which contains the address of the primitive message handler to run, and a sequence
of message-specific data words. The header of the message acts in effect like a process
descriptor for providing efficient message execution. When a message arrives at the specified
node, it lands in the destination node’s queue. The queue acts as a FIFO scheduler of
primitive message processes. When the message moves to the head of the queue, the MDP
executes the message by setting the instruction pointer register to point to the primitive
message handler whose address is in the header of the message.
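
The queueing and dispatch behavior described above can be illustrated with a small Python
model. This is only a sketch and not MDP code: the Node class, the ECHO_ADDR address, and
the echo handler are hypothetical names introduced for the example, and a dictionary stands
in for the locked-down physical memory locations.

```python
from collections import deque


class Node:
    """Toy model of one processing node: a handler table plus a FIFO queue."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.handlers = {}    # "physical address" -> handler routine
        self.queue = deque()  # incoming messages, served first-in first-out
        self.log = []

    def install(self, address, routine):
        # The same address is used on every node, so any node can name it.
        self.handlers[address] = routine

    def receive(self, message):
        # Stands in for the routing hardware depositing a message.
        self.queue.append(message)

    def run(self):
        # The header word selects the handler; the remaining words are
        # handler-specific data.
        while self.queue:
            header, *data = self.queue.popleft()
            self.handlers[header](self, *data)


ECHO_ADDR = 0x40  # hypothetical locked-down handler address


def echo_handler(node, *words):
    node.log.append((node.node_id, words))


nodes = [Node(i) for i in range(2)]
for n in nodes:
    n.install(ECHO_ADDR, echo_handler)

# "Mail" the handler address plus data to node 1, twice.
nodes[1].receive((ECHO_ADDR, 7, 8))
nodes[1].receive((ECHO_ADDR, 9))
nodes[1].run()
```

Because the handler is installed at the same address on every node, the sender needs to
know only the destination node number and the handler address.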

Several useful system services are written as primitive message handlers. Examples
of primitive message handlers include those to make a new object on a node (NEW_MSG)
and to request a copy of a method from a node (METHOD_REQUEST_MSG).

With the addition of primitive messages, we have the ability to process concurrently,
and to support a distributed namespace. We can now extend our virtual memory system
to support naming of objects, not just in the local memory, but on any node in the entire
network. With a distributed namespace, we gain flexibility of resources. We can migrate
objects as we need them to balance load and to free up memory.


2.2.1 Methods and the CALL Message

Up to this point, we have only been able to run foreign code that resides at fixed physical
locations. We desire a more flexible mechanism for dealing with blocks of code, such as those
that will be output by compilers. Since we already have an object-based storage model,
it would be very convenient to store code routines in objects and provide a mechanism for
their execution. We call code routines stored in virtually addressed, relocatable objects
methods, to differentiate them from physically locked-down code sequences. We provide a
mechanism to start these methods executing by writing a primitive message handler called
the CALL message handler. When a CALL_MSG starts executing on a node, it runs the
method indicated in the message argument. This gives us a flexible system of
remote procedure calls.

2.2.2 SENDing Selectors to Objects 

The final operating system layer in our quest for an object-oriented execution model is
the SEND_MSG message handler. A SEND_MSG consists of a selected generic operation,
represented by a unique symbol called a selector, followed by the object(s) that the selector
acts upon. If we wanted to send the DRAW selector to an object (say, a triangle), we
would SEND a SEND_MSG message to the node on which the triangle object resides, passing
the selector DRAW and the virtual address of the triangle object receiving the selector
(called the receiver). When the SEND_MSG handler executes, it determines the appropriate
method to run, and then remotely calls the procedure by sending a CALL_MSG message
to this method, which then draws the triangle.

In order for this system to work, it is necessary to maintain certain system tables
that map pairs of selectors and object classes to the virtual IDs of the methods that
perform the desired operation. It is also necessary to ensure that semantically identical
selector operations get the same selector symbol. In other words, all PLUS operations must
get the same symbol representing +. The exact mechanisms of the class/selector system will
be described in more detail in chapter 6.
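
As a rough illustration of the class/selector lookup, the sketch below maps (class,
selector) pairs to method IDs and then "calls" the chosen method. It is a single-node Python
model under assumed names: the table layout, the method IDs, and the object records are all
hypothetical, and ordinary function calls stand in for SEND_MSG and CALL_MSG messages.

```python
# (class, selector) -> method ID. Table layout and names are hypothetical.
method_table = {
    ("Triangle", "DRAW"): "draw_triangle_method",
    ("Circle", "DRAW"): "draw_circle_method",
}

# Method ID -> code. In the real system these would be relocatable objects.
methods = {
    "draw_triangle_method": lambda receiver: f"triangle with sides {receiver['sides']}",
    "draw_circle_method": lambda receiver: f"circle of radius {receiver['radius']}",
}

# Virtual ID -> object. Each object records its class.
objects = {
    101: {"class": "Triangle", "sides": (3, 4, 5)},
    102: {"class": "Circle", "radius": 2},
}


def call_msg(method_id, receiver_id):
    """Stand-in for CALL: run the method stored under method_id."""
    return methods[method_id](objects[receiver_id])


def send_msg(selector, receiver_id):
    """Stand-in for SEND: pick the method from the receiver's class and the
    selector, then CALL it."""
    receiver_class = objects[receiver_id]["class"]
    method_id = method_table[(receiver_class, selector)]
    return call_msg(method_id, receiver_id)


print(send_msg("DRAW", 101))
print(send_msg("DRAW", 102))
```

The same DRAW selector reaches a different method for each receiver class, which is the
dispatch-on-type behavior the SEND layer adds on top of CALL.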


2.3 High Level Language Model 

For the final part of our tour of the Jellybean Machine, let us step back once more, and 
view the machine from the perspective of the programming languages that will be used to 
write user programs. 

2.3.1 Intermediate Code 

To provide a uniform target language for compilers, we have specified an intermediate
language called i-code. This language has a simple set of operations, and a simple manner of
referencing operands. By passing the i-code through a code generator and a linker/loader,
we can store actual MDP machine code on nodes. The i-code level of the system provides a
convenient entry point for various compilers that requires no knowledge of the underlying
layers. All interaction is via the protected subsystem of the i-code interface. This interface,
in effect, provides an abstract i-code machine that can be of use in many different machine
configurations. Implementations of this interface on different machine architectures would
provide a convenient way to reuse compilation tools and compare system performance.

2.3.2 User Languages 

The user language model is what would be seen by the user of the Jellybean Machine. He/she
would be faced with the language interaction shell and would see none of the internal layers
that compose the system. The currently supported user language is a prefix-notation form
of concurrent Smalltalk [DC]. Other languages, such as a Lisp with Flavors, should also be
possible.



Chapter 3 


Memory Management and 
Addressing System 


Work without hope draws nectar in a sieve 
And hope without an object cannot live 

— Samuel Taylor Coleridge, in Work Without Hope 

Oh call it by some better name 
For friendship sounds too cold. 

— Thomas Moore, in Ballads and Songs: Oh Call It by Some Better Name

The Jellybean Machine, targeted for object-oriented applications, needs to have an
object-based storage model. This chapter sketches the mechanisms that interact to provide
this model. The mechanisms consist of two parts: (1) the services to allocate and
deallocate contiguous blocks of physical memory, and (2) the virtual addressing abstractions
that make objects the basic unit of storage. Virtual addressing allows object relocation
and provides a way to reference storage on foreign nodes. Virtual naming and physical
allocation systems combine to form an object-based programming system.





Figure 3.1: Schematic Model of the Memory System 


At the heart of the object-based system is the NEW system call, which creates a
new object. This routine utilizes the three subsystems of the object system: the translation
manager, the name manager, and the memory manager. The interaction of these subsystems is
shown in figure 3.1.


3.1 “Freetop” Contiguous Heap Allocation

Each node of a Jellybean Machine has its own local memory that can be accessed very
rapidly. Part of this local memory is reserved as a heap from which blocks of memory are
allocated. Heap allocation is done in a straightforward “freetop-next” manner. Memory is
allocated starting from the current top of free memory, and the freetop pointer is moved
past the block allocated. The ALLOC system call handles the allocation requests.

3.2 Compaction is Fast 

Deleting objects fragments the heap, leaving unused “holes.” We reclaim this
storage by sweeping objects down toward the base of the heap to fill up the blank space,
with the freetop following accordingly. Since each local memory is small and fast, and
each processor can sweep in parallel, compaction takes very little time. Figure 3.2 shows a
process of heap allocation, deletion, and compaction.
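
The allocate, delete, and compact cycle can be modeled in a few lines of Python. This is a
minimal sketch, not the ALLOC system call itself: the FreetopHeap class and its block-naming
scheme are hypothetical, and blocks are tracked as (base, length) pairs rather than as words
in a real memory.

```python
class FreetopHeap:
    """Toy freetop allocator over a heap of `size` words."""

    def __init__(self, size):
        self.size = size
        self.freetop = 0   # first word above all allocated blocks
        self.blocks = {}   # block name -> (base, length) for live blocks

    def alloc(self, name, length):
        # Allocate at the current freetop, then move freetop past the block.
        if self.freetop + length > self.size:
            raise MemoryError("heap exhausted")
        base = self.freetop
        self.blocks[name] = (base, length)
        self.freetop += length
        return base

    def delete(self, name):
        # Deleting leaves a hole; freetop does not move.
        del self.blocks[name]

    def compact(self):
        # Sweep live blocks down toward the base, in address order,
        # and let freetop follow.
        next_base = 0
        for name, (base, length) in sorted(self.blocks.items(),
                                           key=lambda item: item[1][0]):
            self.blocks[name] = (next_base, length)
            next_base += length
        self.freetop = next_base


heap = FreetopHeap(16)
heap.alloc("a", 4)
heap.alloc("b", 6)
heap.alloc("c", 4)
heap.delete("b")   # hole of 6 words between "a" and "c"
heap.compact()     # "c" slides down next to "a"
```

After compaction the six words freed by deleting "b" are available again above freetop. Note
that the physical addresses of moved blocks change, which is one reason the virtual
addressing layer described later is useful.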

3.3 Physical Base/Length Addressing 

Blocks of memory are described by physical base/length values supported by the processor’s 
primitive ADDR data type. The base is the starting address of the block of memory, and the 
length is used for access bounds checking. The format of an ADDR tagged value is shown 
in figure 3.3. The tag of the physical address word is a unique number ADDR representing 
a physical address value. The R bit is used to specify that an address value points to a 
relocatable object. The I bit specifies that the address is now invalid. Both of these bits 
are used for the implementation of virtual addressing. 

Figure 3.2: Freetop Heap Allocation, Deletion, Compaction
(panels: Empty Heap, Allocate Objects, Delete Objects, Compact)


Figure 3.3: A Physical Address Word Format

Figure 3.5: A Virtual Address Word (ID) Format


format of this virtual ID is shown in figure 3.5. There are also several utility routines used
to manage the virtual -> physical translation table (called the Birth/Residence Address
Table, or BRAT). These routines add, look up, and remove bindings from the translation
table. They are implemented by the extended system calls BRAT_ENTER, BRAT_XLATE,
and BRAT_PURGE, respectively. Finally, we provide the NEW system call to allocate and
install a new object. This service allocates physical memory, generates a virtual ID, installs
the virtual -> physical binding in the BRAT, and returns both the ID and the address. The
NEW system call is to the virtual addressing model as ALLOC is to the physical addressing
model.
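
The steps performed by NEW can be sketched as follows. This Python model is illustrative
only: the ObjectSystem class is hypothetical, the virtual ID is assumed to be a (birth node,
serial number) pair because the actual ID format is given only in figure 3.5, and the BRAT
is modeled as a plain dictionary.

```python
class ObjectSystem:
    """Toy model of one node's allocator, ID generator, and BRAT."""

    def __init__(self, node_number, heap_size):
        self.node_number = node_number
        self.heap_size = heap_size
        self.freetop = 0
        self.next_serial = 0
        self.brat = {}  # virtual ID -> (base, length)

    def alloc(self, length):
        # Physical allocation: take the block at freetop.
        if self.freetop + length > self.heap_size:
            raise MemoryError("heap exhausted")
        addr = (self.freetop, length)
        self.freetop += length
        return addr

    def new_id(self):
        # Assumed ID layout: (birth node, per-node serial number).
        vid = (self.node_number, self.next_serial)
        self.next_serial += 1
        return vid

    def brat_enter(self, vid, addr):
        self.brat[vid] = addr

    def brat_xlate(self, vid):
        return self.brat.get(vid)  # None when no binding exists

    def brat_purge(self, vid):
        self.brat.pop(vid, None)

    def new(self, length):
        # NEW = allocate memory, generate an ID, install the binding,
        # and return both the ID and the address.
        addr = self.alloc(length)
        vid = self.new_id()
        self.brat_enter(vid, addr)
        return vid, addr


node = ObjectSystem(node_number=3, heap_size=64)
vid, addr = node.new(8)
```

Here `vid` names the object independently of where it sits in memory, while `addr` is the
base/length pair that ALLOC alone would have returned.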

3.4.3 Translation Buffer 

To speed up translation, each processing node has a 2-way set-associative translation buffer, 
and the accompanying ENTER, XLATE, and PURGE machine instructions. The XLATE 
instruction will fault if no binding is found in the cache, and a software exception handler 
will be run to resolve the name. 
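
A software model of a 2-way set-associative buffer with a miss handler might look like the
sketch below. Several details are assumptions made for the example: integer keys, row
selection by key modulo the number of rows, 16 rows, eviction of the older entry in a full
set, and the Python names TranslationBuffer, TranslationFault, and translate. A dictionary
plays the role of the full translation table consulted by the exception handler.

```python
class TranslationFault(Exception):
    """Raised when XLATE finds no binding in the buffer."""


class TranslationBuffer:
    WAYS = 2  # two entries per row (2-way set-associative)

    def __init__(self, rows=16):
        self.rows = rows
        # Each row holds up to WAYS (key, value) pairs, oldest first.
        self.sets = [[] for _ in range(rows)]

    def _row(self, key):
        return self.sets[key % self.rows]  # assumed row-selection function

    def enter(self, key, value):
        row = self._row(key)
        row[:] = [(k, v) for k, v in row if k != key]
        if len(row) == self.WAYS:
            row.pop(0)  # assumed policy: evict the older entry
        row.append((key, value))

    def xlate(self, key):
        for k, v in self._row(key):
            if k == key:
                return v
        raise TranslationFault(key)

    def purge(self, key):
        row = self._row(key)
        row[:] = [(k, v) for k, v in row if k != key]


def translate(buffer, table, key):
    """Fast path through the buffer; on a fault, fall back to the full
    table (the software handler) and cache the result."""
    try:
        return buffer.xlate(key)
    except TranslationFault:
        value = table[key]
        buffer.enter(key, value)
        return value


buf = TranslationBuffer()
table = {1: "addr-a", 17: "addr-b", 33: "addr-c"}  # all three map to row 1
first = translate(buf, table, 1)   # miss, resolved by the handler
again = translate(buf, table, 1)   # hit in the buffer
```

Keys 1, 17, and 33 compete for the same two-entry row, so translating all three forces one
of them out of the buffer; the next XLATE on the evicted key simply faults again and is
resolved in software.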

Figure 3.4: The Structure of an Object

Figure 3.6: Format of the Translation Buffer (16 rows)


3.4.4 Automatic Retranslation


To support maximum efficiency in normal-case situations, the processing node provides an
“invalid” bit in each address (A) register. If this bit is set, it signifies that the ID and A
register have values that are no longer consistent. Any access of an invalid A register will
cause a fault handler to be run, which will retranslate the ID register into the A register
and continue. This way we can be “lazy” and retranslate invalid bindings only if needed.


3.5 Summary 


Physical block allocation is used to reserve segments of memory. Virtual IDs are
associated with these blocks of memory, and bindings are formed, to provide an
“object-based” allocation model. This object allocation model provides the following
benefits:

• An abstract memory model, where “objects” are the primitive metric of storage rather
than physical addresses.

• A location-independent memory model with indirection through a translation table,
allowing ease of relocation.

• The ability to represent the data types of objects.

• The introduction of a global namespace where we can refer to objects residing on any
node of the network.



Chapter 4


Distributed System Support 


I pity the man who can travel from Dan 
to Beersheba and cry, ’Tis all barren! 

— Laurence Sterne, in A Sentimental Journey (1768)

In the previous chapter we developed an object-based allocation model and a global
naming system. With this functionality, we gain much greater flexibility. We take this
system one step further in this chapter, as we describe a mechanism to migrate objects
from node to node. This added ability requires a few extensions to the virtual naming
model presented in the previous chapter.

4.1 The Idea 

In the previous naming model, virtual IDs were bound to physical addresses. Since objects
were not allowed to migrate, they were forced to always reside on their birthnode. Now that
objects are allowed to emigrate to different nodes, we need to expand our name resolution
system. In addition to virtual -> physical bindings, we add a virtual -> node-number
binding, semantically representing a “hint” that the object in question now resides on a
different node. Figure 4.1 shows that node #1 has a hint that an object is on node #2.


Figure 4.1: An Example of Hints
(node #1 holds the binding ID 1 -> node #2; Obj1 resides on node #2)



4.2 Chaining of Hints 

These node-number “hints” indicate another node on which to look for the object in
question. The current implementation allows chaining of hints (although cycles will never
form). If we ever follow a path of hints and find no binding for the object ID, we then query
the birthnode, which is required to have a path to the object in question. Figure 4.2 is a
snapshot of a system where a chain of hints has formed to an object.

Figure 4.2: Chains of Hints

A question then arises as to how long to let these chains of hints be. Some distributed
systems, such as System R* [Lin80], only allow paths of length 1, i.e. one hint. If the
object is not one hint transition away, the system then defaults to the birthnode, where
the location of the object is found, and the previous incorrect hint is updated. However,
in our system we choose to allow multiple hints because objects may migrate quite a bit,
and a one-hint limit would increase the number of birthnode accesses. Performance could
significantly degrade if a popular object moved quite a bit (as we would expect popular
objects to do). If we notice in later performance experiments that chains of hints become
commonplace, adding latency and unnecessary network traffic, we can adopt one of two
solutions: (1) only allow one hint, or (2) collect and update old hints periodically.


4.3 Calculating Likely Nodes From Object IDs 

The operating system provides a system call for finding a likely node that an object resides 
on. This ID_TO_NODE call takes the virtual ID of the object and returns a node number. 
It uses the algorithm charted in figure 4.3, which works as follows. The virtual
ID is looked up in the translation table. If it is not there, we have no idea where the object 
is, so we check the birthnode. If there is a binding, but the binding is to a hint (an integer 
value), we return this hint as the probable residence node. Finally, if the binding is to a 
physical address, the object is local, and the local node number is returned. 
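
The lookup described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the operating system's code: the dictionary `table`, the `PhysAddr` wrapper that marks a local binding, and the `birthnode_of` helper (which assumes the birthnode is stored in the high bits of the ID) are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysAddr:
    """Marks a binding to a local physical address (hypothetical wrapper)."""
    addr: int


def birthnode_of(vid: int) -> int:
    # Assumption for illustration only: the birthnode lives in the ID's high bits.
    return vid >> 16


def id_to_node(table: dict, vid: int, local_node: int) -> int:
    """Return the node on which the object named by vid most likely resides."""
    binding = table.get(vid)
    if binding is None:
        return birthnode_of(vid)       # no binding: fall back to the birthnode
    if isinstance(binding, PhysAddr):
        return local_node              # physical address: the object is local
    return binding                     # integer hint: probable residence node
```

The three return statements correspond to the three outcomes in the text: no binding, a physical-address binding, and a hint.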


4.4 Virtual To Physical Translations In The Migrant Object World

Now that objects are allowed to wander aimlessly across the nodes of the Jellybean Machine, 
virtual to physical address translations are necessarily slightly more sophisticated. Three 
conditions can occur when we attempt to translate a virtual ID into a physical address. 

1. We find a physical address value for the binding 

2. We find a hint to where the object currently resides

3. We find no binding for the object

Case 1 is the normal situation. The physical address associated with the object ID is 
returned. Case 2 implies that the object is rumored to be on a foreign node. We then 
send a request to this node asking that the object be shipped here for processing, and we 
suspend our process onto a wait list. Case 3 occurs when a node has no idea where an 
object resides. In this case, we send a request to the birthnode asking for the object. If the 
birthnode doesn’t know where an object is, it loops, mailing messages to itself, assuming 
the object is in a state of transition somewhere. 


4.5 Bouncing Objects 

Note that this method of finding data objects may cause them to bounce around from node 
to node, as different processors wish to compute on them. This is the direct result of several 
design decisions: (1) each processor executes only one task at a time, (2) memory is not 
shared among processors, (3) mutable data objects are not cached, and (4) an object’s data 
lies entirely on one node. The first and second decisions are fundamental to the design of
our machine. We chose the grain size and memory model to provide a moderately fine
grain, highly scalable processor. We chose not to do object caching because it is expensive
to do in software, and is difficult on a network-based memory model. It may be possible to
provide coherent caching in the future, however. The final restriction, that an object’s state
is contained on one node only, is for simplicity’s sake, and can be at least partially lifted by
the introduction of “distributed objects” described in a later section.

So, with these characteristics in mind, it becomes important for us to try to prevent 
unnecessary “pinging” of objects from node to node. One way this is done is by “sending 
work to the object” rather than “sending the object to the work”. Unfortunately, this is 
difficult to do in the general case due to problems with transferring processor state. As a
compromise, we set the following policy.

1. If we were sending a selector to an object, and the object is not local, we forward the
selector to the location of the object¹.

2. If we were accessing a non-local, immutable object, we halt, saving our process state,
request a copy of the object, and restart execution when the copy arrives.

3. If we were accessing a non-local, mutable object, we halt, saving our process state,
move the object here, and restart when it arrives.



This policy reduces the severity of the “pinging” problem, because work tends to accumulate 
at the object, while at the same time, allowing the object to move if it has to. 
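
The three-part policy above amounts to a small decision function. The sketch below is an illustration only; the `Action` names and the boolean parameters are hypothetical and stand in for whatever state the real handlers inspect.

```python
from enum import Enum, auto


class Action(Enum):
    RUN_LOCALLY = auto()        # object is here: nothing to do
    FORWARD_SELECTOR = auto()   # rule 1: send the work to the object
    REQUEST_COPY = auto()       # rule 2: halt, fetch a copy, restart
    REQUEST_MOVE = auto()       # rule 3: halt, move the object here, restart


def remote_access_policy(is_local: bool, is_send: bool, is_mutable: bool) -> Action:
    """Choose what to do when a task touches an object."""
    if is_local:
        return Action.RUN_LOCALLY
    if is_send:
        return Action.FORWARD_SELECTOR
    if not is_mutable:
        return Action.REQUEST_COPY
    return Action.REQUEST_MOVE
```

Only the last case actually moves the object, which is why work tends to accumulate at the object rather than the object chasing the work.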


4.6 Details About Object Migration 


This section formalizes the mechanisms provided to migrate objects. When we try to access
a non-local object, we mail away to request a copy of the object or to move the object
(depending on whether the object is immutable or mutable, respectively)². When we wish
to request a non-local object, the following steps are taken:


1. The processor state is saved in a context object, and the context is marked waiting
for the ID of the object being requested.

2. The context is placed in a resource wait table that indicates processes waiting on
objects.

3. A MIGRATE_OBJECT message is sent to the best guess residence of the object,
asking it to be migrated to the requesting node, and the process suspends, able to
execute the next message in the queue.

4. This MIGRATE_OBJECT message is forwarded down the chain of hints. If it lands on
a node with no binding for the ID in question, the search continues at the birthnode.

5. Finally this message arrives at the node the object resides on, and the message handler
is run.

• If the object in question is marked unmovable, then the message is sent back to
the start of the queue; otherwise the message handler decides whether the object is
mutable or not, and acts accordingly.

• If it is mutable, the bindings are removed from this node, the object is mailed in
an IMMIGRATE_OBJECT message back to the requesting node, and the object
is deleted.

• If the object is read-only, the data is mailed in an IMMIGRATE_COPY message
back to the requesting node.

6. These messages eventually arrive back at the requesting node.

• When an IMMIGRATE_OBJECT message arrives, the message handler (1) allocates
the object, (2) marks the object unmovable (until it can update the birthnode,
to prevent a race condition where hint updates may occur out of sequence),
(3) copies the data into the object, (4) mails a NOW_RESIDING_AT message to
the previous node of residence, and (5) calls the RESOURCE_ARRIVED system
call, which will queue the restart of the waiting contexts.

• When an IMMIGRATE_COPY message arrives, the handler (1) allocates the object,
(2) marks the object header as a copy, (3) binds the old ID to this new object,
(4) copies the data into the object, and (5) calls the RESOURCE_ARRIVED
system call, which will queue the restart of the waiting contexts (copies can be
collected when storage runs low).

7. The NOW_RESIDING_AT message makes a hint from the current node to the new
node, and mails an UPDATE_BIRTHNODE message to the birthnode of the object,
telling it of the object’s new location.

8. The UPDATE_BIRTHNODE message makes a hint to the new location and mails an
OBJECT_MOVABLE message to the location of the new object, passing its ID.

9. The OBJECT_MOVABLE message marks the object movable. Now the object is free
to move again.

Figure 4.4 shows an example of this process.

¹ The class/selector late-binding activation model is discussed in detail in chapter 6.

² Since a process cannot be interrupted by a same priority message, it does not suffer from livelock and
can always make headway.
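
To make the bookkeeping concrete, the sketch below simulates the move of a mutable object in Python. It is a deliberately simplified, synchronous model under stated assumptions: each node is a dictionary mapping an ID to either `("obj", data)` or `("hint", node)`, the message names in the comments are written with underscores, and the unmovable flag, the wait table, and the read-only copy path are all omitted. The function names are hypothetical.

```python
def find_residence(nodes, vid, start, birthnode):
    """Follow hints from start; a node with no binding defers to the birthnode.

    Assumes, as the text requires, that the birthnode has a path to the object.
    """
    node, path = start, [start]
    while True:
        binding = nodes[node].get(vid)
        if binding is None:
            node = birthnode            # no binding here: continue at the birthnode
        elif binding[0] == "hint":
            node = binding[1]           # follow the hint to the next node
        else:
            return node, path           # ("obj", data): the object resides here
        path.append(node)


def migrate(nodes, vid, requester, birthnode):
    """Move a non-local mutable object to requester; return the nodes visited."""
    hint = nodes[requester].get(vid)
    guess = hint[1] if hint else birthnode           # step 3: best-guess residence
    holder, path = find_residence(nodes, vid, guess, birthnode)  # steps 4-5
    _, data = nodes[holder].pop(vid)                 # step 5: binding removed
    nodes[requester][vid] = ("obj", data)            # step 6: object installed
    nodes[holder][vid] = ("hint", requester)         # step 7: NOW_RESIDING_AT
    if birthnode != requester:
        nodes[birthnode][vid] = ("hint", requester)  # step 8: UPDATE_BIRTHNODE
    return path
```

Note that intermediate nodes keep their old hints after the move; they remain valid because the old residence now forwards to the new one, which is how chains of hints grow.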


4.7 Summary 

The addition of a mechanism for object migration adds much more flexibility to the Jellybean
system. Without imposing policy, the migration and copying system provides the
basic mechanism for resource sharing. To alleviate name resolution bottlenecks at object
birthnodes, I designed a system of cycle-free hints to indicate where objects currently lie. It
is not clear how long to allow these chains of hints to be. Long chains of hints would cause
unnecessary network traffic and increase latency. Having single hints would increase the
number of birthnode accesses and require mechanisms for removing old links. The system
currently supports chains of hints.

Figure 4.4: Step-by-step Object Migration

Chapter 5


A Virtually Addressed Code 
Execution Model 


They shall mount up with wings as eagles; 
they shall run, and not be weary, and 
they shall walk, and not faint 

— The Holy Bible, Isaiah, 40:31 

At the most primitive level, we could execute physically addressed blocks of machine
code by directly setting the registers, or by sending primitive messages. Unfortunately,
we have no mechanism to allocate or relocate these blocks of code; they are physically
addressed and sedentary. This chapter presents the system mechanisms that interact to
provide a more flexible, but low-overhead model for code execution by taking advantage of
the virtually-addressed, object-based storage model we developed in the last two chapters.

I will present (1) the advantages of an object-based code model, (2) the mechanisms 
for executing object-based code, (3) local caching of methods, (4) contexts, suspension, 
and waiting for resources, and (5) efficient ways of distributing code models across a large
network.

Figure 5.1: Format of the CALL Message


5.1 Taking Advantage of Object Storage 

By taking advantage of the object storage and naming system we developed, we are able 
to wrap threads of code inside objects and gain all of the benefits of this more powerful 
object-based abstraction, of which a few are: (1) dynamic allocation, (2) relocation, even 
across nodes, and (3) convenient naming and name resolution. This view of code blocks as
objects (or methods, which is what we call code blocks that are wrapped in objects) allows
us to consider more advanced calling models, such as the ability to conveniently support 
remote procedure calls (RPCs) and the flexibility to “send the work to the data” rather 
than just the typical mechanism of “bringing the data to the work”. 

5.2 An Overview of the CALL Message 

Ignoring for the moment the question of initially creating methods, let’s concentrate on the
mechanisms needed to execute them. The operating system provides a primitive message
handler for a CALL message. To start a method running, we mail a CALL message to the
node the method resides on¹, passing as arguments the virtual ID of the method to execute,
and any data the method expects as parameters. The format of the CALL message is shown
in figure 5.1. When the CALL message arrives at the node, the handler first checks if the
method is here. If so, the code is started. If not, rather than forward the message to the
birthnode, we note that

1. Methods are immutable, and therefore can be copied

2. Certain methods might tend to be called often from many nodes

and adopt a policy of copying the method to this node. This way we provide local copies
on many nodes (these can be periodically purged by some appropriate strategy to free up
memory).

¹ Since we build this on top of the virtual, distributed namespace model, we can use hints to make our
best guess where the method resides.

Once the method is on the node where the CALL message arrived, the message handler can
start up the method. It does that by

• Translating the ID of the method into its physical address

• Placing this physical address of the code block in A0²

• Placing a 2 in the IP register


These steps will start the processor executing instructions from the method, starting at the
third word. We skip the first two words of the method, because these hold object header
information. The steps of the CALL message are schematically charted in figure 5.2. If
the method somehow relocates on us while we are executing³, the process that relocated
the object will invalidate the A0 register. When our process starts again, it will fetch
an instruction through A0 and cause an invalid address fault. This will run an exception
handler to retranslate the method ID (in ID0) into the physical address (putting it in A0
again), and we will continue as if nothing had happened.
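
The startup and recovery steps can be mimicked with a toy model. The Python sketch below is an assumption-laden illustration, not the handler itself: the `Node` class, its `methods` table, and the `fetch_copy` callback are hypothetical, and the fields `a0`, `ip`, and `id0` merely stand in for the code base register, the instruction pointer, and the register holding the method ID.

```python
HEADER_WORDS = 2  # the first two words of every object hold header information


class Node:
    def __init__(self):
        self.methods = {}   # method ID -> physical address of a local copy
        self.a0 = None      # base address of the code being executed
        self.ip = None      # offset of the next instruction relative to a0
        self.id0 = None     # ID of the method being executed


def handle_call(node, method_id, fetch_copy):
    """Start a method, copying it to this node first if it is not here."""
    if method_id not in node.methods:
        node.methods[method_id] = fetch_copy(method_id)  # methods are immutable
    node.id0 = method_id
    node.a0 = node.methods[method_id]   # translate the ID to a physical address
    node.ip = HEADER_WORDS              # begin at the third word


def on_invalid_address_fault(node):
    """Retranslate the method ID after the code block has been relocated."""
    node.a0 = node.methods[node.id0]
```

Because the instruction pointer is an offset rather than an absolute address, only the base register needs repair after a relocation.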

² A0 always points to the base of the code currently executed, unless the processor is in absolute mode,
where this value is always treated as 0, regardless of what it holds. The IP register holds the relative offset of
the program counter within this code block starting at A0. (If we are in absolute mode, the IP register acts
in effect like an absolute address rather than a relative address, because absolute mode makes the processor
pretend the value of A0 is 0.)

³ This could be caused by heap compaction, or the method being migrated to another node to free up
space, among other reasons.

Figure 5.2: Flowchart of the CALL Message Handler

5.3 Caching Method Copies


Since method code is immutable, we can cache methods, just as we can cache other read-only 
data. To request a copy of a method we: 

1. Allocate a context object to hold our processor state, so we can restart later 

2. Copy the processor state into the context 

3. Place the context in the resource wait table indicating that our context is waiting on
this requested method

4. Mail off, requesting a copy of the method 

5. When the method arrives, it is placed on our node and our context is restarted 

These cached copies will have the copy bit set in the object header so that the storage
reclaimer will know that this cached object is a duplicate, and can be purged if space is
tight. Let’s now look in a bit more detail at contexts and the resource wait table, two
crucial mechanisms for supporting high level execution control.

5.4 Contexts 

5.4.1 Why Do We Need Them? 

Contexts are just objects that hold the important state of the processor, so the current task
can be halted and later restarted where it left off. In addition, contexts can provide space
for local variables used in the task’s computation.

5.4.2 How Do We Make Them? 

Contexts are allocated by the NEW_CONTEXT system call. The call takes as an argument
the number of additional variables needed, and it returns a context big enough to hold the
minimum necessary processor state plus the additional variables. When a process is done
with a context, it should explicitly deallocate it with the FREE_CONTEXT system call.
Figure 5.3 shows the format of a typical context.

Figure 5.3: Structure of a Typical Context

As with all objects, the first two words are used by the object manager. The next
three words are used to hold an offset to the processor state part of the context (for faster
restarts), a pointer to the next context in a list of contexts, and a value indicating that the
context is waiting on a particular resource. The context then contains some amount of user
reserved space followed by nine words of processor state. The minimal size of a context, with
no user space, is 14 words (2 + 3 + 9).

5.4.3 How Do We Make Them ... Quickly!? 

Since we expect contexts to be used very often, and since we want method startup costs to
be small and methods to be short, we don’t want a majority of our execution time to be
spent allocating contexts. To accommodate these constraints, we reuse old contexts rather
than allocating new ones each time. When a context is deallocated, it is placed back on a
free context list. The next time a context is requested, we try to re-use one from the free
list, since this will take only a few instructions.

However, contexts vary in size, and we wouldn’t want to have to walk the list each
time to see if we have a context big enough to meet our request. So, we only save contexts
that are of one common size. This way, any time we request a context of this “common” size,
we can yank the first one off of the free list and use it. The format of the free context list
is shown in figure 5.4.

The first context in the free context list is pointed to by the CONTEXT_FREE_LIST
operating system variable. If no contexts are in the free list, the OS variable is set
to NIL. Each context in the free list points to the next context in the list by the context’s
NEXT_CONTEXT slot as shown previously in figure 5.3. The final context in the free list
has its NEXT_CONTEXT slot set to NIL.
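
A minimal sketch of this free list follows, written in Python for illustration. The class names, the use of `None` for NIL, and the value chosen for the "common" size are assumptions made here; the source does not state the actual common size.

```python
COMMON_USER_WORDS = 4  # assumed "common" size; the real value is not given


class Context:
    def __init__(self, user_words):
        self.user_words = user_words
        self.next_context = None        # link to the next free context, or None


class ContextPool:
    def __init__(self):
        self.context_free_list = None   # None plays the role of NIL

    def new_context(self, user_words):
        """Reuse the head of the free list for common-size requests."""
        head = self.context_free_list
        if user_words == COMMON_USER_WORDS and head is not None:
            self.context_free_list = head.next_context
            head.next_context = None
            return head
        return Context(user_words)      # slow path: allocate a fresh context

    def free_context(self, ctx):
        """Only common-size contexts are kept; others are simply dropped."""
        if ctx.user_words == COMMON_USER_WORDS:
            ctx.next_context = self.context_free_list
            self.context_free_list = ctx
```

Both the reuse and the release path touch only the head of the list, which is what keeps the cost to a few instructions.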



Figure 5.4: The Free Context List

The operating system provides one primitive message (RESTART_CONTEXT) and two
system calls (XFER_ID and XFER_ADDR) to restart a context. The system calls take
either an ID or a physical address of a context, and restart it, copying the processor state
from the context to the processor registers. The RESTART_CONTEXT message takes a context ID
and transfers control to it by calling the XFER_ID system call on the context ID.

5.5 The Resource Wait Table

The resource wait table is a system data structure that indicates which contexts are waiting 
for which services. It consists of two parts. The first part of the wait table is a fixed size 
associative table that binds resource IDs to waiting contexts. Figure 5.5 shows a portion of 
a hypothetical table. We see several contexts waiting for ID1, one context waiting for ID2, 
and the rest of the slots are empty. Empty slots are set to NIL. When a resource arrives, 
the wait table is searched, and the contexts in the list bound to the ID are restarted. 

Searching this table is fast, but unfortunately, we cannot bound the number of
entries that try to occupy the table. At some time, we may run out of room. When this
happens, we resort to a slower form of data structure and link the contexts waiting on
resources in a list called the resource overflow list. If we don’t find a binding in the table,
we begin searching the list of contexts. Since each context has a RESOURCE_NEEDED
slot, we can always tell what resource the context is waiting for. This provides us a way to
continue if the table becomes full. By sizing the table appropriately, it may be possible to
keep use of the overflow list to a minimum.
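
The two-part structure can be sketched as follows. This Python model is illustrative only: the class and method names are hypothetical, a dictionary with a capacity limit stands in for the fixed-size associative table, and the arrival routine always scans the overflow list as well, which is slightly more conservative than searching it only when the table has no binding.

```python
class WaitTable:
    def __init__(self, slots):
        self.slots = slots      # number of resource IDs the fixed table can hold
        self.table = {}         # resource ID -> list of waiting contexts
        self.overflow = []      # overflow list of (resource_needed, context)

    def wait(self, resource_id, context):
        """Record that context is waiting on resource_id."""
        if resource_id in self.table:
            self.table[resource_id].append(context)
        elif len(self.table) < self.slots:
            self.table[resource_id] = [context]
        else:
            self.overflow.append((resource_id, context))   # table is full

    def resource_arrived(self, resource_id):
        """Return every context to restart now that the resource is here."""
        ready = self.table.pop(resource_id, [])
        remaining = []
        for needed, ctx in self.overflow:
            if needed == resource_id:
                ready.append(ctx)
            else:
                remaining.append((needed, ctx))
        self.overflow = remaining
        return ready
```

With a table sized generously for the expected number of outstanding requests, the linear scan of the overflow list should rarely find anything to do.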
Figure 5.5: The Resource Wait Table

Figure 5.6: The Resource Wait Overflow List

Figure 5.7: A Parallel Resource Request Bottleneck in a 3 x 3 Network


5.6 Removing Method Caching Bottlenecks with Distribution Trees


The current scheme for method caching implies that in many cases, nodes wanting methods 
will have to ask the birthnode of the method (or at least the residence node) for a copy. 
If many nodes simultaneously need the same method (as will likely happen with highly 
parallel execution), then the birthnode will be deluged with method requests which it can 
only handle sequentially. These bottlenecks could degrade performance considerably. For
example, figure 5.7 shows a network of 9 processing nodes. Suppose nodes 2 through 9 all
requested a method copy from node 1. Node 1 would receive a barrage of 8 requests for the
method, which would eliminate all parallelism, since it could consider each request only
sequentially.

One way to reduce the threat of performance-degrading bottlenecks is to set up a
distribution hierarchy, so that each node requests resources from its local distribution center
(the distribution hierarchies are different for different resources). Each of these local centers
would make requests to its superior, all the way up to the master resource center. We can 
use this type of distribution graph to help in requesting method copies (or copies of any 
type of immutable data for that matter). 

Take again the 3 x 3 node network example, where 8 nodes request a method from 
node 1, but this time impose a distribution bureaucracy like that shown in the tree in figure 
5.8. This time, node 1 only has to handle 3 messages, from nodes 2, 4 and 5. Each of these
nodes serves as a local distribution center for the remaining nodes. Node 2 services nodes 3
and 6, node 4 services nodes 7 and 8, and node 5 services node 9. In this manner we have
permitted more parallelism to continue, as well as limiting the burden on node 1 (which 
could cause queue overflow, network blocking, and other conditions where performance 
degrades considerably). 

Let’s now discuss some ways that a distribution tree method caching scheme can be 
implemented in the Jellybean Machine system software. First, what are the constraints we
are working under? 

• The distribution tree edges must be easily computable 

• We need to make reasonable choices for branching factor versus tree depth. Too high a 
branching factor might create bottlenecks, but too low a branching factor would tend 
to cache unnecessary copies, and suffer long latency as the birthnode was many edges 
away from the requesting node. 

• We would like to have significantly different trees for different resources. Different 
methods should have different distribution hierarchies, again to decrease bottlenecks, 
and to distribute resources more thoroughly. 

Figure 5.8: A Distribution Tree Bureaucracy To Balance Load in a 3 x 3 Network

One fairly simple first attempt at a distribution tree formula might be to go to the
distribution center that is halfway between the current node and the birthnode in terms
of hops. In other words, to find the next regional distribution center, given the birthnode
coordinates $(x_b, y_b)$ and our current coordinates at $(x_c, y_c)$, we would calculate the halfway
coordinates $(x_i, y_i)$ by:

\[
x_{real} = \frac{x_c - x_b}{2}, \qquad
x_i = \begin{cases}
x_b + \lfloor x_{real} \rfloor & \text{if } x_{real} \ge 0 \\
x_b + \lceil x_{real} \rceil & \text{if } x_{real} < 0
\end{cases}
\]

\[
y_{real} = \frac{y_c - y_b}{2}, \qquad
y_i = \begin{cases}
y_b + \lfloor y_{real} \rfloor & \text{if } y_{real} \ge 0 \\
y_b + \lceil y_{real} \rceil & \text{if } y_{real} < 0
\end{cases}
\]


This is in fact the algorithm used to create the distribution tree in figure 5.8. Figure 5.9
shows several distribution trees created by this algorithm for networks of various sizes and
various birthnodes. This method creates trees with depth at most $\log_2 m + 1$ for a network
with a maximum dimension of m nodes. So, for a reasonably sized machine of 4096 nodes
(64 x 64) we would at most have to traverse $\log_2 64 + 1$, or 7, edges of the distribution tree.
For enormous systems, say 1K nodes on a side, the tree depth will be only 11.
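
The halfway rule is easy to experiment with. The Python sketch below rests on two assumptions that are not stated in the source: halving the offset from the birthnode rounds toward the birthnode, and the nodes of the 3 x 3 example are numbered 1 through 9 in row-major order. Under those assumptions it reproduces the parent relationships described for figure 5.8. All function names are hypothetical.

```python
def halfway(b: int, c: int) -> int:
    """Move c halfway toward b, rounding toward b (int() truncates toward zero)."""
    return b + int((c - b) / 2)


def next_center(birth, cur):
    """Coordinates of the next distribution center on the way to the birthnode."""
    return (halfway(birth[0], cur[0]), halfway(birth[1], cur[1]))


def path_to_birthnode(birth, cur):
    """All nodes visited from cur up to and including the birthnode."""
    path = [cur]
    while cur != birth:
        cur = next_center(birth, cur)
        path.append(cur)
    return path


def coords(n, width=3):
    """Row-major node number (from 1) to (x, y); the numbering is an assumption."""
    return ((n - 1) % width, (n - 1) // width)


def node_number(p, width=3):
    return p[1] * width + p[0] + 1


# Parent of every node in the 3 x 3 example when node 1 is the birthnode.
parents = {n: node_number(next_center(coords(1), coords(n))) for n in range(2, 10)}
```

Rounding toward the birthnode matters: rounding away from it would leave a node one hop from the birthnode choosing itself as its own distribution center.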






Chapter 6 


System Support of a 
Type-Dispatched Calling Model 


We never sent a messenger save with 
the language of his folk, that he 
might make the message clear for them 


— The Koran, 14:4


One of the most important aims of the Jellybean Machine is to provide a concurrent 
processor that efficiently supports object-oriented, late-binding procedure activations. This 
chapter introduces the idea of message-passing and late-binding programming methodologies,
and discusses the system services in the Jellybean Machine operating system that
support this manner of programming. 

6.1 Message-Passing and Object-Oriented Languages 

There has been much interest during the past few years in “object-oriented” programming. 
Though this term is not particularly precise, it does describe a fairly cohesive set of languages
exhibiting behavior markedly different from the typical Algol-like programming style. There
are two characteristics in particular that languages typically categorized as object-oriented
share.

First of all, operations tend not to be thought of as functions applied to data objects,
as they are in Algol derivatives. Instead, data objects are “personified” as “actors” that
receive requests made of them. These requests are made by “sending a message” to an
object called the receiver of the message. The operation that was requested of the object
is typically called the selector, since it selects the operation to be performed. So, where a
standard Algol-like language might calculate the determinant of a matrix m by

determinant(m);

an object-oriented implementation might look something like

(send m 'determinant)

We call this concept of performing operations by sending selectors to objects the message-passing
paradigm. This paradigm turns out to be a very convenient model of computation.

The second characteristic of object-oriented languages that makes them appealing is
the fact that the operations on different data types can have the same names. This allows
us, for example, to have an 'area selector for circle data types, as well as an 'area selector for
polygon data types. In many other languages this would cause a naming conflict, requiring
us to set up an explicit naming convention, such as calling circle_area() and polygon_area()
routines on objects of the proper type.

But, more importantly than just saving us the hassle of naming conflicts, object-oriented
languages actually decide which procedure to run for a certain data type. In other
words, when an 'area selector arrived at an object, the system would decide whether this
object is a circle or a polygon and automatically run the correct procedure. In addition,
if the receiver of the 'area selector was not a data type that supported the area operation
(such as an integer), then an error would be reported by the system. In Algol-like languages,
it is the burden of the programmer to know the type of the object he is dealing with, so he
can call the proper operation. Having the system decide is crucial in many symbolic languages
with loose type-checking, like Lisp, where we can have lists of many different types of objects¹.
This kind of activation is called a late-binding activation, since we don’t decide what routine
will be run at compile-time, but instead wait until later, when the message send is actually done.

Operations with the same name and semantically similar meaning supported by 
various data types are called generic operations since these operations represent the generic 
behavior the programmer wants to accomplish (add things, draw things, calculate areas of 
things). The specific behavior is calculated at run-time once we know the data type of the 
object (called the class of the object), and the selected operation, by a process known as 
class-selector lookup. 


So, object-oriented languages have two main components:

1. Operations are invoked through the message-passing paradigm rather than a more
applicative (function-call) style.

2. Each data type has its own set of supported operations, where names can be the same
as in other data types, and may represent generic operations over varied data types.
Activations are caused by late-binding sends which look up the specific operation to run
based on the class of the object receiving the message (the receiver) and the selected
operation (the selector).


Our goal now is to provide a system substrate that will efficiently and conveniently support 
these aims. 


¹ Think of an object-oriented drawing program, where we have a list of many different
objects that make up a picture. A convenient way to refresh the screen in an object-oriented
system is to send a draw message to each object in the list. Based on the data type of each object at
run-time, the appropriate routine (circle draw, rectangle draw, text draw, etc.) is activated.

Figure 6.1: Format of the SEND Message


6.2 Late-Binding Send Execution Support 


The next task of the operating system is a mechanism to simulate the message-passing
paradigm. We already have network communication hardware that allows data to
be sent between nodes. We also have a global object namespace provided by the virtual
memory extensions. Together, we can use these components to implement the message-passing
execution model.

To do this, we implement one more primitive message, the SEND message handler 
(not to be confused with the SEND machine instruction). This primitive message handler 
acts in the object-oriented manner we showed earlier. Figure 6.1 shows the significance of 
the different words of the message. The first word is the address of the SEND message
handler, the second word is the selector, and the third word is the receiver. The rest of the
words are arguments, and information about where to reply to.

When the SEND message arrives on the node that the receiver resides on (we forward
this SEND message to wherever the receiver resides), the primitive message handler is
started. Figure 6.2 shows a flow chart that describes how the SEND message handler works.
It first picks the class out of the receiver object (so we know what data type the receiver is).
We then merge the class and selector together into a class/selector word (shown in figure
6.3). Now that we have the class and selector, we try to see if there is a class/selector →
method ID binding in the cache. If so, we start the method with the CALL message as
discussed in the previous chapter. If not, we need to look up the binding.

At the current time, we do not have enough insight into the characteristics of machine
behavior to feel comfortable locking down the class/selector lookup algorithm. For
this reason, we provide the lookup routine in a method. We insist that this method is allocated
before any others so it always has the same method ID. This LookupMethod method
takes the class and selector, and consults some distributed system table to find the method 
ID corresponding to this class and selector. 
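
The dispatch path can be sketched in Python as follows. This is an illustration under assumptions, not the handler's actual code: the node is modelled as a dictionary, a `(class, selector)` tuple stands in for the merged class/selector word, `lookup_method` is a hypothetical callable playing the role of the LookupMethod method, and storing the looked-up binding in the cache afterwards is our assumption.

```python
def handle_send(node, selector, receiver_id, args):
    """Resolve a late-binding send into the CALL that should be issued."""
    cls = node["objects"][receiver_id]["class"]     # class of the receiver
    key = (cls, selector)                           # stands in for the merged word
    method_id = node["cache"].get(key)
    if method_id is None:
        # Cache miss: consult the replaceable lookup routine.
        method_id = node["lookup_method"](cls, selector)
        node["cache"][key] = method_id              # assumed: remember the binding
    return ("CALL", method_id, receiver_id, *args)
```

For example, with a lookup routine backed by a table that binds `("circle", "area")` and `("polygon", "area")` to different method IDs, the same `area` selector resolves to a different method depending on the receiver's class, and a class with no such binding makes the lookup routine fail.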


6.3 Loading Class/Selector Methods into the System 

Let’s now briefly look at how the class/selector method information is loaded into the Jellybean
system. Figure 6.4 shows the schema for how the compiler and run-time environment
will interact with the Jellybean Machine processing network. The compiler is responsible
for generating class and selector numbers and for compiling the source language into MDP
machine code. A certain node of the network is picked for the method to reside on by some
distribution policy. The method data as well as the class and selector that this method
represents are sent to this chosen node by the NEW_METHOD message. The format of a
NEW_METHOD message is shown in figure 6.5.

When a NEW_METHOD message arrives at a node, the NEW_METHOD message
handler begins executing. It makes an object to hold the method, and copies the code from
the message into the object. The NEW_METHOD handler then calls the InstallMethod
method, which takes the class, selector, and method ID and makes the bindings in the
class/selector → method ID data structures.

Specification of the class/selector → method ID data structures has been ignored
without attempts at subtlety. We do not have enough insight to definitely specify the best
format for these tables. We can talk a bit about the issues involved. (1) We should be
able to take a class/selector word and efficiently find the corresponding method ID. (2) The
table should be distributed around the network in a way to minimize bottlenecks.

Figure 6.2: Flowchart of the SEND Message Handler

Figure 6.4: A Coarse View of the Compiler/Machine Interface

    NEW_METHOD Routine Address

Figure 6.5: Format of the NEW_METHOD Message

A reasonable way of doing this would be to apply some “bit-twiddling” function
to the class/selector words to decide what node is responsible for knowing their bindings.
The actual data structures could be hashed, or perhaps each class would have an object
that holds the method IDs for every selector. One annoying problem with any approach
is bootstrapping: we need to know how we can get to the data. Because of
the added indirection through the LookupMethod and InstallMethod handlers, we have the
flexibility to try several approaches and test their performance in the future.
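
As one illustration of such a “bit-twiddling” function, the sketch below folds the two halves of a class/selector word together and reduces the result modulo the number of nodes. The 32-bit word width, the XOR fold, and the name `responsible_node` are all assumptions made for the example; the text deliberately leaves the actual function open.

```python
def responsible_node(cs_word: int, node_count: int) -> int:
    """Pick the node that holds the binding for a class/selector word.

    Assumes a 32-bit word: XOR the high half into the low half so that
    both the class and the selector influence the result, then reduce
    modulo the number of nodes.
    """
    folded = (cs_word ^ (cs_word >> 16)) & 0xFFFF
    return folded % node_count


# Class 3, selector 7, on a hypothetical 16-node network.
node = responsible_node(0x00030007, 16)
```

Because every node can compute the same function locally, no central directory is needed to find the node responsible for a binding.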


6.4 Returning Values 

Return values can be sent with the REPLY message. This message takes the context ID
to reply to, the slot number of the context to fill, and one word of reply data. The reply
data is passed by value if it is a primitive data word, or by reference if an object is to be
returned.


6.5 Summary


The class/selector calling model is a convenient mechanism for invoking tasks. By
implementing it directly in the operating system kernel, we can keep it efficient.
To provide extensibility, we provide hooks to the LookupMethod and InstallMethod
handlers, so these routines can be reconfigured independently of the rest of the kernel.


Chapter 7


Storage Reclamation in the 
Jellybean Machine 


But virtue, as it never will be moved, 
Though lewdness court it in a shape of heaven, 
So lust, though to a radiant angel linked, 
Will sate itself in a celestial bed, 
And prey on garbage 

— Shakespeare, in Hamlet I, V. 53 


7.1 Introduction


The successful performance of our machine relies on the fact that sufficient parallelism
exists on the grain of methods. In order for this to happen, it is important that data
dependencies on shared objects are minimized by adopting a more functional approach,
where methods interact by value rather than by reference as much as possible. This
situation promotes a large number of small, short-lived objects. Because of the minute amount
of memory on each processing node, an efficient storage reclamation mechanism becomes
important. The characteristics of our system, however, cause many straightforward
methods of storage management to break down. In this discussion we will examine
some of the important properties of the Jellybean Machine, and the ways these properties
influence reclamation. The rest of this chapter provides a discussion of the issues pertaining
to reclamation on the Jellybean Machine, and a possible first-cut at a garbage collection
algorithm.

7.2 Automatic Collection is Desirable 

Because the system is object oriented, and because we have a small memory with frequent
allocations, object reclamation is important. Because objects can be shared in complex
ways, and because of the high level programming model we wish to support, we wish most
object deallocations to be handled automatically by a “garbage collector” that searches for
objects that are no longer in use (i.e. there are no pointers to the object anywhere) and
deallocates them when necessary.

7.3 Choosing a Collection Approach 

Several characteristics of the Jellybean Machine will guide us in the choice of garbage
collection. Let’s remind ourselves of the character of the machine. 

7.3.1 Memory Organization 

The memory in a Jellybean processor is small, and it is local to that processor. Memory
allocation is done in a simple contiguous manner. Compaction can be done in parallel
very quickly. Memory objects are segment-based and are given unique object IDs. In
addition, these object IDs are concatenated with a birth node number to provide a global
virtual address. The virtual to physical translation mechanism uses caching to improve
name resolution, but this relies on locality. Random access to many addresses could be
very expensive.

7.3.2 Addressing System and Network Topology 

The Jellybean Machine uses a distributed memory to provide “site autonomy” [LS80] in 
order to perform local operations very fast, and avoid memory conflicts. But, the tradeoff is 
that foreign accesses will be very costly, involving a message send mechanism that is at least 
an order of magnitude slower. In addition, distributed memory can require synchronization, 
and the delays of network communication may make certain synchronization conditions 
impossible. The network may cause bottlenecks to occur if too many messages are sent to 
one place, and may hold data in transit. The network latency may also be a factor. 


7.3.3 Garbage Collection Character 

Garbage collectors take on various different characters. The common approach of reference 
counting collection doesn’t appear to be feasible in the Jellybean Machine because (1)
it cannot collect cyclic data structures, (2) every pointer change will require a (possibly 
remote) object access, and (3) we are not always aware when “dead” pointers get changed. 
For these reasons, we decided to attempt some variant of a pointer chasing garbage collection 
mechanism. The next section describes the implementation of a pointer chasing garbage 
collector for our machine in some detail. 
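
Point (1) can be made concrete with a small, single-machine sketch. It is not a model of the Jellybean Machine: the object graph is a plain dictionary and both functions are hypothetical helpers. It only shows that a reference counter never frees an unreachable cycle, while a pointer chasing (tracing) pass does identify it as garbage.

```python
def refcount_garbage(edges, roots):
    """Objects a reference counter would free (counts that drop to zero)."""
    counts = {obj: 0 for obj in edges}
    for obj in roots:
        counts[obj] += 1
    for targets in edges.values():
        for target in targets:
            counts[target] += 1
    freed = set()
    work = [obj for obj, count in counts.items() if count == 0]
    while work:
        obj = work.pop()
        freed.add(obj)
        for target in edges[obj]:  # freeing obj drops its outgoing references
            counts[target] -= 1
            if counts[target] == 0:
                work.append(target)
    return freed


def tracing_garbage(edges, roots):
    """Objects a pointer chasing collector would free (everything unmarked)."""
    marked, work = set(), list(roots)
    while work:
        obj = work.pop()
        if obj not in marked:
            marked.add(obj)
            work.extend(edges[obj])
    return set(edges) - marked


# "c" and "d" point at each other but are unreachable from the root "a".
edges = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"], "e": []}
by_counting = refcount_garbage(edges, roots=["a"])
by_tracing = tracing_garbage(edges, roots=["a"])
```

Here counting frees only "e", while tracing also reclaims the cycle formed by "c" and "d".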


7.4 A Pointer Chasing Garbage Collector 


There are several properties that we would like our garbage collector to have.

[...] sends [...] so that we can squeeze as much computation out of our processor as
possible. We would like to avoid the situation where our machine runs for a while and
then hangs up for an hour while garbage collection occurs.


7.4.1 The General Idea 

Most of the work on pointer chasing garbage collection algorithms to date is targeted at
sequential or shared-memory machines with large virtual memories. The standard algorithm
is based on the copying collector proposed by Baker. This has been expanded into
an incremental collector, and has been tuned to various object lifespans, with a good degree
of success. Still, these approaches are targeted at a genre of machine of a radically
different character than the J-Machine. With an admitted scarcity of knowledge in distributed
collection, the rest of this chapter serves only to sketch a simple vision of such a collector
[Tot88], and some of the problems that are faced.

A simple collector would involve recursive marking by message sends, and would
compact the heap rather than scavenging or copying, due to the small amount of memory
per chip. The phases of this simple collector would be:

• [...] run out of memory. Perhaps this occurs on a time count.

• [...] objects are marked unreferenced initially, as well as setting any necessary
[...] variables.

• [...] starting at root [...]


7.4.2 Problems

Synchronization and “Travelling References” 

A major problem in garbage collection across a communication medium is lack of synchronized,
instantaneous transmission. This shows itself in garbage collection in a few ways.
One of the more annoying problems is how to be sure that the last pointer to an object 
isn’t in transit when the garbage collector comes along. The garbage collector doesn’t see 
any pointers in the network, so an object may be deleted because a pointer was “travelling” 
between nodes where it can’t be noticed. We can refer to this as the travelling reference 
problem. Figure 7.1 shows a portion of a network of processors, where an ID of an object 
is in the network when the collector is run. 
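
The travelling reference problem can be reproduced in a toy, single-process simulation. Everything here is hypothetical scaffolding (per-node heaps as dictionaries, one in-flight message as a tuple); the point is only that a marker which looks at node memory alone misses a reference that is still in the network.

```python
def mark(heaps, roots):
    """Return the objects reachable from roots via references stored on nodes."""
    refs = {}
    for heap in heaps.values():
        refs.update(heap)
    marked, work = set(), list(roots)
    while work:
        obj = work.pop()
        if obj not in marked:
            marked.add(obj)
            work.extend(refs.get(obj, []))
    return marked


def deliver(heaps, message):
    """Land an in-flight message: store the carried reference in its holder."""
    node, holder, ref = message
    heaps[node][holder].append(ref)


# Object "x" lives on node 2. Its only reference is in a message on its
# way to node 1, where it will be stored into the root object "r1".
heaps = {0: {"r0": []}, 1: {"r1": []}, 2: {"x": []}}
roots = ["r0", "r1"]
in_transit = (1, "r1", "x")

marked_too_early = mark(heaps, roots)      # collector runs before the message lands
deliver(heaps, in_transit)
marked_after_landing = mark(heaps, roots)  # collector runs once the network is empty
```

In the first marking pass "x" is unmarked and would be deleted even though it is still in use; after the message lands it is marked as expected.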

An obvious way to resolve this situation is to prevent all upcoming message sends 
during collection, so that no other pointers are mailed into the network, and then to wait 
until all messages in transit have landed in a queue. We can tell when all messages have 
landed by either waiting a length of time we know to be longer than the maximum latency 
from the most distant nodes, or by sending “scout” or “bulldozer” messages down the 
network dimensions. When all these “bulldozer” messages arrive, they will have pushed all 
other messages out of the way, and the network will be empty. 

Problems With Disabling Sends 

In order to prevent the travelling reference problem, we have to

• Disable sends so no new references enter the network.

• Wait for all messages in the network to land.

But, we have no explicit mechanism in the MDP processing node to disable sends¹. If we
did, we could allow the processors to run until they tried to execute one of these disabled
instructions. When this happened, a fault could occur and some manner of process halting
could occur (such as saving a context for the process for later re-starting²).

¹ Or more preferably, a mechanism that would disable any sends that would cause a reference
to be mailed into the network; all other messages could continue.

Figure 7.1: Object ID Travelling in Network

A possible way to resolve this problem at first might be to place guards in certain
high-level execution handlers such as SEND and CALL. These handlers are run when a
SEND or CALL message (two messages that ask a node to start executing a method)
arrives. Inside these handlers we could have a guard that would defer the execution of
the method until collection finishes. This goes a long way toward resolving the problem of
travelling references if most of the code that mails IDs around is code that is executed with
CALL and SEND³.

Another way to shut down the machine might be to disable the queue execution.
This would cause messages to back up in the queues. Certain messages that we would want
to execute could be handled by having the processor “walk” the queue by hand, looking for
certain types of messages (such as garbage collection messages). It could also pull items
out of the queue and into the heap to prevent queue overflow.


Problems With Background Execution 


Since, at the start of garbage collection, we stop message sends by various possible
mechanisms, our concurrent machine is effectively shut down. This violates our desire for the
collector to run in the background, in parallel with method execution.


In addition, the lack of a register set for background mode prevents any way for the
Message Driven Processor to take advantage of idle time in a reasonable way. Since any
message would take priority over background mode, the register set will be trashed. Any
computation done in background mode must shut off interrupts, which, instead of taking
advantage of idle time, takes advantage of application execution time! Some compromises
can be made, such as having background mode start up small units of computation by
sending priority 0 messages, or by queuing up contexts of waiting-to-run background processes
that are begun by a context startup message send when the background loop is entered.
Again, various improvements should be examined.

² This [...] the problem of insufficient memory for a context allocation in the middle of
collection. When there is not enough local memory, the standard [...] allocation on a
foreign node. But this requires mailing references in the network, which is exactly what we
are trying to avoid. This underscores the difficulty present in providing efficient,
convenient methods of preventing travelling references.

³ [...] Apart from CALL and SEND messages, all other messages are primitive system
messages (where the system may have to be responsible for avoiding ID mailing during
collection). [...] function returns. If we [...] of a CALL [...], the guard method will
eventually stop the machine, with every processor being idle or waiting [...] a function.
This implementation has at least 2 requirements that we must [...]. (1) We must insure that
non-CALL and non-SEND messages do not violate the rules and mail references during garbage
collection time. (2) Catastrophe can occur when we run out of memory trying to make
contexts to hold the deferred execution requests.


7.5 Summary 

The characteristics of the Jellybean machine necessitate a heap collector to reclaim storage.
This collector may have to run often (since our nodes have such a small amount of memory).
A reference counting approach seems to be out since there is a large overhead in changing
the object reference counts (and it is difficult to know when a reference is written over
and thus deleted) as well as the fact that it cannot handle cyclic structures (if we insist
that cyclic structures are illegal, that results in a big loss in terms of flexibility; if we don’t
collect such structures, we will rapidly run out of memory). A pointer chasing collector has
problems with travelling references (where the marker will not see the final reference to
an object because it is in the network, and thus deletes the object), but seems to be the
most viable approach. It would be desirable to have the collector run in the background
without shutting the machine down, but the travelling reference problem seems to make
this difficult.



Chapter 8 

Support for Concurrent 
Programming Languages 

I get by with a little help from my friends. 
— John Lennon and Paul McCartney, in “A Little Help From My Friends” (1967) 

The Jellybean Machine Operating System Software provides several noteworthy
services to support concurrent programming languages, both for functional and efficiency
reasons. These include (1) the SEND and REPLY message handlers, (2) futures, (3)
distributed objects, and (4) the interaction interface.

8.1 High-Level Languages 

8.1.1 CST 

Currently, the high-level language being used in the Jellybean Machine project is a
Smalltalk-80 based language called CST (Concurrent SmallTalk) [DC]. CST uses a Lisp-like
prefix syntax, and codes sends implicitly in a function application metaphor. CST allows
asynchronous messages to exploit concurrency, and fully utilises the late-binding execution
model. Locks are provided for explicit synchronization, and a “distributed object” data
type exists to scatter object state over a large area. This CST code will be compiled to
intermediate code, which is passed through a back end that converts the i-code to MDP
machine code and loads it into the system. The compilation and loading mechanism was
previously sketched in figure 6.4.

The rest of this chapter describes several operating system services that support the 
execution of the object-oriented model of computation. 


8.2 SEND and REPLY 


As discussed in earlier chapters, the SEND message handler provides the machinery to run
a method based on the class of a receiving object and the selector symbol “sent” to the
object. In the current system, the SEND message may also describe one object to return a
value to. This return-slot is specified by passing the ID of the object to hold the returned
value (the returned value must be one word, either a primitive value such as an integer or
a symbol, or the ID pointer to the object), the slot (index into the object) number, and the
node the object is on.

The REPLY handler actually performs the return of the value. The REPLY message
mails the target object ID, the target variable number, and the one word return value to the
node number specified in the SEND message. When a REPLY message arrives at a node,
the returned value is stored in the indicated slot of the target object, and any processes
waiting for a variable to be filled by a reply are restarted.


8.3 Futures


8.3.1 Conforming to Data Dependencies 

Data dependencies impose an order on execution. If a computation result is used in a
calculation, the result must be available before the calculation can occur. In a sequential
processor, there is no problem. The instructions are ordered in such a way as to ensure that
previous results are available in certain places before those values are needed. In a
distributed processor, on the other hand, a computation may take an indeterminate amount
of time to complete on a remote node. Because of this, we may get to a point where a value
is needed before the calculation of the value has completed. It is necessary to wait until
this result returns before continuing the calculation.


8.3.2 The Check’s in the Mail 

This section details a mechanism used prominently by the Jellybean Machine to impose data 
dependency orderings conveniently. The mechanism is quite simple. Whenever a calculation 
is spawned off in parallel, the destination location where the value of the calculation is to 
be stored is filled with a specially tagged value, called a context future, indicating that the 
value will arrive to the context in the future. When the calculation replies with the value, 
the future is overwritten with the real value of the computation. 

When an access is made to a location in a context, using the value located there,
there is the possibility that the value hasn’t been returned yet. We can tell, because the
location will still be filled with a context future (c-future). Any read
of a location containing a c-future will cause the processor to fault, (1) saving the processor
state in the context object and (2) marking the context as waiting for a c-future. When a
reply arrives at a context, the context is checked to see if it is waiting on a c-future. If so,
it is queued to be restarted.

        Advantages                  Disadvantages

        Simple                      Large Inertia
        Transparent                 Parallelism Wasted
        Minimal Synchronization     False Restarts


Table 8.1: Pros and Cons of Dependency Enforcement by Futures


Let’s examine this context-future mechanism in a bit more detail to see what it 
really provides us and what deficiencies it faces. Table 8.1 itemizes some of the advantages 
and disadvantages of the future mechanism. 

8.3.3 Advantages 

As we said earlier, the most desirable characteristic of the c-future approach is that it is
simple to implement and understand. It fits well into the existing system, being
“optimistic”: it takes advantage of the fault mechanism, the tagged architecture, and
contexts.

Being transparent to the programmer/compiler writer is desirable as well. No 
burden is placed on the code generator to explicitly keep track of non-completed tasks. 
No extra instructions need to be placed in-line to check for the presence of values, or to 
manipulate semaphores. 

Finally, the future approach only pays the price of synchronization if it is necessary.
If a value returns before it is needed, or if an arm of a conditional is never executed,
we will not need to pay the synchronization price¹.

¹ Though we do require all replies to be in before we deallocate a context, so we can re-use
context IDs.


8.3.4 Disadvantages

On the other hand there are several disadvantages to this approach. The system is subject 
to high inertia. The total cost of halting and saving a context and restarting it when 
the return value arrives is relatively high. The worst case occurs when we have many 
dependencies following one after another. Here, we would keep halting and restarting, 
making very little progress. It can be difficult to gain any momentum, because of the time 
spent saving and restarting contexts. This case isn’t quite so bad if we have other tasks 
queued up that can take advantage of the free time, and if the replies take a while to 
arrive (which is likely to be the normal case). The real question is one of balance between 
computation time and system overhead time. 

Because execution is controlled at the grain size of methods, whenever a sequential
execution encounters a c-future value, the entire method will be suspended. Thus once we hit
a c-future value, other possibly executable code in the method is not run. This is directly
the result of basing the grain of parallelism on the unit of methods, and it has the effect of
wasting parallelism as opposed to a more fine-grain execution model.

C-futures also can lead to a problem of false restarts, where a reply for a different
slot would restart the context, which would immediately halt on the same c-future again.
If we were waiting on variable A to return and a reply to fill variable B arrives, the context
would be restarted falsely, and when we read A we will hit the same future and halt again.
This is rectified in the prototype implementation by using the RESOURCE-NEEDED slot
of the context to hold the number of the slot the context needs filled. When a REPLY arrives,
the context is only restarted if it was waiting on the slot the REPLY came to fill.
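
The following sketch models this behaviour in ordinary software. It is an illustration only: the real mechanism relies on hardware tags and faults, while here a sentinel object and an exception stand in for them, and the class and attribute names (`Context`, `resource_needed`, `FutureFault`) are invented for the example.

```python
class FutureFault(Exception):
    """Raised when a slot still holding a c-future is read."""


C_FUTURE = object()  # stand-in for the specially tagged c-future value


class Context:
    def __init__(self, slot_count):
        self.slots = [C_FUTURE] * slot_count
        self.resource_needed = None  # slot number a suspended context waits on
        self.restarts = 0

    def read(self, slot):
        value = self.slots[slot]
        if value is C_FUTURE:
            self.resource_needed = slot  # suspend, remembering the slot we need
            raise FutureFault(slot)
        return value

    def reply(self, slot, value):
        """Store a reply; restart only if this is the slot being waited on."""
        self.slots[slot] = value
        if self.resource_needed == slot:
            self.resource_needed = None
            self.restarts += 1  # stands in for queuing the context to restart
            return True
        return False


ctx = Context(2)
try:
    ctx.read(0)                     # slot 0 (variable A) has not arrived yet
except FutureFault:
    pass
restarted_by_b = ctx.reply(1, 5)    # reply for variable B: no false restart
restarted_by_a = ctx.reply(0, 7)    # reply for variable A: context restarts
```

Without the `resource_needed` check, the reply for slot 1 would have restarted the context only for it to fault again on slot 0.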

8.4 Distributed Objects

A final system characteristic designed to support efficient high-level language execution is 
the introduction of distributed objects. A distributed object is one where its state is broken 
up into segments called constituent objects, and scattered across the processing network.
Its purpose is to allow parallel access to different parts of an object. 

A single object can only be directly accessed by the node it resides on, and the node 
it resides on can only run one task, implying that an object can only be computed on by 
one task at a time. In the absence of coherent caching strategies, this one-object—one-task 
constraint can potentially severely limit parallelism. 

By distributing parts of the object over several nodes we can provide some extra 
(albeit limited) concurrency. The hope is that this increase of concurrency along with the 
fact that an object-oriented programming model should provide access to many distinct 
objects being computed on at once will prevent object bottlenecks from becoming a serious 
performance hindrance. 

The system supports distributed objects by providing (1) allocation and (2) constituent
lookup services. When a distributed object is allocated, the system creates constituent
objects and scatters them in a reasonable way around the network. Each constituent
object (CO) has a normal object ID number which is unique for each CO, and a distributed ID or
DID which is the same for all constituents of a distributed object. This DID contains the
information necessary to locate any constituent object.

8.4.1 A Distributed ID Format 

Figure 8.1 shows a possible format for a distributed ID. The DID knows the number of
constituent objects, the hometown node of the first object, and a node-unique serial number.
This prototype DID format places a limit of 256 COs per distributed object and 256
distributed objects per node.

             8 bits                  16 bits               8 bits
    +-----+-----------------------+---------------------+---------+
    | TAG | NUMBER OF CONSTITUENT | HOMETOWN-NODE       | SERIAL  |
    |     | OBJECTS               | ("ROOT")            | NUMBER  |
    +-----+-----------------------+---------------------+---------+

Figure 8.1: Distributed ID Format
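
A minimal sketch of packing and unpacking the three data fields is given below. It assumes the fields sit in the low 32 bits in the order shown in the figure and leaves the tag out entirely; the function names are hypothetical, and how the prototype encodes a count of exactly 256 in 8 bits is not stated, so the sketch simply stores raw 8-bit values.

```python
def pack_did(count: int, hometown: int, serial: int) -> int:
    """Pack constituent count (8 bits), hometown node (16 bits), serial (8 bits)."""
    if not (0 <= count < 1 << 8 and 0 <= hometown < 1 << 16 and 0 <= serial < 1 << 8):
        raise ValueError("field out of range")
    return (count << 24) | (hometown << 8) | serial


def unpack_did(did: int):
    """Recover (count, hometown, serial) from a packed DID."""
    return (did >> 24) & 0xFF, (did >> 8) & 0xFFFF, did & 0xFF


did = pack_did(count=4, hometown=0x1234, serial=9)
fields = unpack_did(did)
```

The 8-bit fields are what impose the limits of 256 mentioned above.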

8.4.2 Dealing out the Constituent Objects 

When a distributed object is allocated, we want to have a function that maps each
constituent object to a node number. This function should have two properties. It should
(1) be easy to compute, and (2) scatter objects in an acceptable manner.

The goal of distribution is to provide concurrency, so with this aim as the measure of 
success, any distribution scheme would be equivalent. But, we need to take into account how 
the processor load is distributed around the network as well. There are two dichotomous 
goals of constituent distribution, (1) to scatter the objects uniformly across the network so 
there are no hotspots and (2) to scatter the objects locally to prevent long distance network 
traffic. 

Dispersion or Locality? 

These seemingly contradictory aims argue against each other. If we scatter objects
uniformly, especially if there are very few objects, the data may lie very far away from the
majority of the computation. Even though some of the computation will migrate near the
data and spawn from there, there still may be a great deal of network traffic caused by
the processes still proceeding from the root of the computation. In time, migration of work
may balance the load appropriately, but we still have worries about uniform distribution.

\[ (\mathit{birthnode} + n \times \mathit{stride}) \bmod \mathit{nodes} \]

Figure 8.2: Distribution of Constituent Objects

On the other hand, if we dump the constituent objects close together, the computation
will cluster around the data, and not hinder the performance of the rest of the network
via long distance traffic, but this local hotspot may overwhelm the computational resources
of this local area of processors.


A Simple Dispersal Approach 

The first design of the distributed object system leaves this question for further study,
and adopts a simple, relatively disperse manner of dealing out constituent objects. We
adopt a simple uniform distribution strategy, hoping that the load balancing mechanisms
incorporated into the system will work effectively. To ensure the efficiency of the calculation
of the function, we use the simple distribution algorithm shown in figure 8.2. The node
numbers we describe are a finite interval of numbers
$\{n \in \mathbb{N} : 0 \le n < \mathit{nodes}\}$ that we might call
ordinal node numbers, and not the system network address node numbers, which encode the
total addressing space of the network. The conversion between the two formats is simple.
Figure 8.3 shows some sample distributions for various sized networks, birthnodes, and
constituent object counts.
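
The distribution rule of figure 8.2 is short enough to write out directly. One caveat: the text does not define the stride, so the sketch assumes stride = nodes div constituents (at least 1), which spreads the constituents evenly over the ordinal node numbers. The function name is hypothetical.

```python
def constituent_nodes(birthnode: int, constituents: int, nodes: int):
    """Ordinal node number for each constituent object n = 0 .. constituents-1.

    Applies (birthnode + n * stride) mod nodes, with the stride assumed
    to be nodes // constituents (never less than 1).
    """
    stride = max(1, nodes // constituents)
    return [(birthnode + n * stride) % nodes for n in range(constituents)]


four_by_four = constituent_nodes(birthnode=5, constituents=4, nodes=16)
three_by_three = constituent_nodes(birthnode=0, constituents=3, nodes=9)
```

With these inputs the constituents land on nodes 5, 9, 13, 1 of the 16-node network and on nodes 0, 3, 6 of the 9-node network.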



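[Panels: 4 by 4 Network; 3 by 3 Network]

Figure 8.3: Constituent Object Distribution Examples

\[
\begin{aligned}
l &= \left\lfloor \frac{\mathit{currentnode} - \mathit{birthnode}}{\mathit{stride}} \right\rfloor \times \mathit{stride} + \mathit{birthnode} \\
r &= \left\lfloor \frac{\mathit{currentnode} - \mathit{birthnode} + \mathit{stride}}{\mathit{stride}} \right\rfloor \times \mathit{stride} + \mathit{birthnode} \\
&\text{if } l < \mathit{birthnode} \text{ then } l = l - \mathit{nodes} \bmod \mathit{constituents} \\
&\text{if } r < \mathit{birthnode} \text{ then } r = r - \mathit{nodes} \bmod \mathit{constituents} \\
n &= \min(\mathrm{hops}(\mathit{currentnode}, l),\ \mathrm{hops}(\mathit{currentnode}, r))
\end{aligned}
\]

Figure 8.4: Equations for Choosing a Nearby Constituent Object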


8.4.3 Choosing a Constituent Object 

We now have a first attempt mechanism to assign node numbers to each constituent object. 
Given a constituent object, we can find the node of its residence. For simplicity, we prevent 
constituent objects from being migrated. Now, we want to provide an algorithm to choose a 
constituent object given a DID. We could do this randomly, but in order to take advantage 
of locality, we want to choose a constituent object that is reasonably close to the current 
node. We do this by finding the ordinal node numbers of the constituent objects on either 
side of the current node number (l and r for left and right) and choose the one (n) with the
minimum distance in x-y hops. We have to be careful about “wraparound”. The algorithm 
is described in figure 8.4. 
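
The goal of this selection can be illustrated with a brute-force version that checks every constituent rather than only the left and right neighbours. It is not the closed form of figure 8.4 and sidesteps its wraparound handling. The sketch assumes ordinal node numbers are laid out row by row on a mesh of the given width with no wraparound links, and that stride = nodes div constituents; both are assumptions, as are the function names.

```python
def hops(a: int, b: int, width: int) -> int:
    """x-y hop count between two ordinal nodes on a row-major mesh."""
    ax, ay = a % width, a // width
    bx, by = b % width, b // width
    return abs(ax - bx) + abs(ay - by)


def nearest_constituent(current: int, birthnode: int, constituents: int,
                        nodes: int, width: int) -> int:
    """Ordinal node of the constituent object closest to the current node."""
    stride = max(1, nodes // constituents)  # assumed definition of the stride
    candidates = [(birthnode + n * stride) % nodes for n in range(constituents)]
    return min(candidates, key=lambda node: hops(current, node, width))


# 4 by 4 network, constituents on nodes 5, 9, 13 and 1; we are on node 10.
chosen = nearest_constituent(current=10, birthnode=5, constituents=4,
                             nodes=16, width=4)
```

Checking only the two neighbouring constituents, as the figure does, avoids the scan over all constituents at the cost of the wraparound cases.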



Chapter 9 


Issues From a Prototype System 


Keep thy heart with all diligence; 
for out of it are the issues of life 

— The Holy Bible, Proverbs 4:23


This chapter discusses in some detail the relevant issues that arose in the design and
implementation of a prototype operating system. The following topics will be discussed:

• The sizing of the BRAT 

• How to handle a full translation table 

• The scarcity of virtual names 

• Out of memory problems 

• Queue size 

• Queues, stacks, and saving processor state 

These situations are troubling enough to require discussion. The actual prototype
implementation can be found in an appendix at the end of the thesis. Specifications of the system
calls and message handlers can also be found in the appendices.


9.1 Sizing the BRAT

To support the global virtual namespace, we use the Birth/Residence Address Table to 
hold the necessary translation bindings. This serves a purpose similar to a page table in 
a multi-level paged memory system, or a segment table in a segment addressable memory 
system. The BRAT needs to hold at least 

1. virtual → physical mappings for objects residing on this node

2. virtual → node number links for objects that were born on this node, but now reside
elsewhere

9.1.1 Memory Limitation 

But, due to the small amount of memory on each chip, we face a severe restriction on 
the number of bindings that can be stored. Reserving room for system data structures, 
operating system variables, and the heap, we are left with a paltry amount of memory for 
the BRAT. This will directly limit the number of objects creatable on a node. We must
make a careful compromise between heap size and translation table entries. We must also be 
able to purge entries from the table when objects are deleted, stressing an efficient storage 
reclamation strategy. 


9.1.2 BRAT Use Scenarios 


Let’s take a look at a few possible scenarios that can occur with object management. 

1. There is room left in the heap and the BRAT for more objects to be allocated. 

2. There is room left in the BRAT but no more room left in the heap. 

3. The heap contains many small objects that don’t take up much room, but fill the 
BRAT, so that no more objects can be created. 

4. The heap can be nearly empty, but no more objects can be allocated because the
BRAT is full of entries of migrated objects.

The first case is the most desirable one; we wish we could have this happen all the time.
The second case is undesirable, but will probably happen reasonably often due to the small
memory space. This can be rectified by exporting objects to other nodes to free up heap
space. The third and fourth scenarios, however, occur because of lack of translation table
space due to the presence of large numbers of resident and/or migrated objects. It is these
two cases that we would like to minimize.

The prototype system that was developed assumed 1K of RAM per node. Of this
memory, 424 words were reserved for processor and OS data structures. Thus each processor
is left with only 600 words to be shared between the heap and the translation table. The
question that arises is how to partition the BRAT and the heap in a reasonable manner.

9.1.3 A Prototype Sizing Based On Average Object Size 

We have no measures as to object size in our system, but we might be able to suggest a 
reasonable approximation of, say, 10 words per object¹. With 2 words of header for each 
object, this would leave 8 words of object space. So, each object would take up 10 words 
of heap space and 2 words of BRAT space, allowing $\frac{600}{10} = 60$ objects. But we also need to 
reserve room for bindings of objects born on this node but now residing elsewhere. Let’s 
assume that we pick a limit for this, such as the total number of average-size objects that 
could fit in the heap. This would allow us to migrate every object and still fill the heap 
with average-sized objects. This leaves us with the following equations. 

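\begin{align*}
\mathit{heapsize} + \mathit{bratsize} &= \mathit{freememory} \\
\mathit{residentobjects} &= \frac{\mathit{heapsize}}{10} \\
\mathit{migratedobjects} &= \mathit{residentobjects} \\
\mathit{bratsize} &= 2\,(\mathit{residentobjects} + \mathit{migratedobjects}) \\
\Rightarrow \mathit{heapsize} &= \tfrac{5}{7} \times \mathit{freememory} \\
\Rightarrow \mathit{bratsize} &= \tfrac{2}{7} \times \mathit{freememory}
\end{align*}

¹ Though of course this will depend greatly on the type of program being run. 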

With 600 words of free space, this leaves the following parameters. 

heapsize = 428 
bratsize = 172 

In a 4K RAM node, we might expect the following configuration as a reasonable one. 

heapsize = 2552 
bratsize = 1020 


In the prototype operating system, the BRAT size has been set at 128 words, rather than 
172, for ease of implementation. 

9.2 Running Out of Binding Space 

Sooner or later, with even our best efforts at insightful sizing of the BRAT, we will run 
out of room to make any bindings. There are several conceivable ways of resolving this 
situation. 

1. Throw up your hands and quit. 

2. Forward your allocation request to another node. 

3. Make the BRAT bigger. 

4. “Delegate” some of the bindings in the BRAT to another node. 

5. Change the birthnodes of some virtual addresses to make other nodes responsible. 

The current operating system implements choice 1 for the most part. There is also some 
code to support choice 2, but this is complicated by the fact that we might not be 
able to allocate a context (as discussed in an upcoming section). If this mechanism could 
be made to work, it might be acceptable enough, given that any system will break when 
the nodes begin to run out of memory. Investment in a proper load-balancing policy 
may alleviate this problem. The operating system also supports the resizing of the BRAT, 
but because of the hashing mechanism currently used (described in an upcoming section), 
arbitrary resizing of the BRAT is difficult to do. 

The delegation of IDs is possible, but requires some thought. We need a way to 
specify which IDs are delegated to which nodes, and this should take significantly less storage 
than would be required to actually store the bindings. We could delegate ranges of IDs to 
a node, but this node must have room for the range, and when this new node runs out of 
room, it must also be able to delegate. This is a possibility for the future. The fifth item 
in the list, changing the birthnodes of virtual addresses, would be very expensive, requiring 
some synchronization and a large broadcast of messages. But perhaps this could be done 
during the garbage collection phase, or offline, or at the end of the day as a background job 
(given a suitably large machine). 

9.3 Scarcity of IDs 

As a related issue, given the virtual ID format of 16 bits of birthnode and 16 bits of serial 
number, each node can only generate 65536 IDs. In the current system, it is likely that 
many applications would run through this ID space in a fantastically short amount of time. 
Of course, the time is dependent on the applications that are run, but we can sketch a rough 
estimate for how long we can run before running out of IDs on a node. 

The following calculations assume a 10MHz processing node where the average 
instruction is 1.5 cycles long. We assume that the queue is always full of work to be 
done. We assume that each message-spawned task will be 200 instructions long (far 
above the likely amount). We finally assume that only 10% of the tasks that come in will 
involve an allocation of an object. 


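$$\frac{10^{7}\ \text{cycles}}{\text{second}} \times \frac{1\ \text{instruction}}{1.5\ \text{cycles}} \times \frac{1\ \text{task}}{200\ \text{instructions}} \times 0.1\ \frac{\text{allocations}}{\text{task}} \approx 3333\ \frac{\text{allocations}}{\text{second}}$$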


At this rate, a node would run out of IDs in roughly 20 seconds. Though these numbers are 
questionable at best in the absence of actual measurements, it is quite clear that the ID 
space is completely inadequate. We need a larger virtual ID, say by having 68 bit 
words rather than 36 bit words, but in the meantime it might suffice to (1) borrow bits from 
the node number field or (2) attempt to re-use certain IDs. Borrowing bits would be a 
short-term solution: by limiting our prototype machine to a 1K-node machine, we could get a 
64-fold increase in serial numbers, allowing a node to run for 20 minutes with the assumptions 
made above. But, for simplicity’s sake, the current implementation has not adopted this 
format. It would be a good idea to do this in the future until we build a machine with 
larger words. 
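
The estimate can be reproduced with a few lines of arithmetic. The following is only a sketch in Python: the function and parameter names are illustrative and do not come from the prototype, and the default values are the assumptions stated above (10 MHz clock, 1.5 cycles per instruction, 200 instructions per task, 10% of tasks allocating). The 22-bit case corresponds to borrowing 6 bits from the node number field.

```python
def allocations_per_second(clock_hz=10_000_000, cycles_per_instruction=1.5,
                           instructions_per_task=200, allocating_fraction=0.1):
    """Allocation rate implied by the assumptions stated in the text."""
    instructions_per_second = clock_hz / cycles_per_instruction
    tasks_per_second = instructions_per_second / instructions_per_task
    return tasks_per_second * allocating_fraction


def seconds_until_exhaustion(serial_bits, rate):
    """Time for one node to consume every serial number of the given width."""
    return (2 ** serial_bits) / rate


rate = allocations_per_second()
print("allocations per second:", round(rate))
print("16-bit serial numbers last (s):", round(seconds_until_exhaustion(16, rate), 1))
print("22-bit serial numbers last (min):",
      round(seconds_until_exhaustion(22, rate) / 60, 1))
```

Changing any one default shows how sensitive the lifetime is to the workload assumptions, which is the main reason to treat the figure as an order-of-magnitude estimate.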


The second idea is a more interesting research issue. We already reuse context 
IDs by requiring contexts to have received all replies before they are put on the free list. 
This way, the number of IDs reserved for contexts (probably the most frequently allocated 
object) is significantly cut. There may also be ways of reusing normal object IDs, but a 
space-efficient way of noting these reused IDs may be difficult. Here are a few possible ideas 
on how to reuse IDs. 

1. Keep a fixed size table of free IDs. When an object is freed, the ID will be placed in 
the table. When an ID is needed, this free table will first be checked. The biggest 
problem with this approach is that when the table fills, IDs will not be placed in the 
table and they will be “lost” forever. 

2. Provide a separate routine for allocating “short-lived” objects. These objects would 
take their IDs from a common, fixed-size pool of consecutive IDs whose freeness could 
be signified by a single bit for each ID. For example, we might reserve 256 “short-lived” 
IDs per node. The short-lived IDs’ serial numbers might range from 0 to 255 
and the pool could be represented by eight 32-bit words signifying an array of 256 bits, 
where a 0 indicates that the ID is in use and a 1 indicates that it is free. If these objects 
are truly short-lived, and they represent the bulk of ID requests, then this approach 
might greatly extend the lifetime by conserving regular IDs. 





3. Every now and then, perform an ID “garbage collection and compaction” where all 
IDs are renamed to consecutive IDs, in effect compacting the ID space. This involves 
issues similar to those of changing an ID’s hometown node number. It seems 
to be very expensive, but it may be possible to interleave this with the normal garbage 
collection. 
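
The second idea, a bitmap over a fixed pool of short-lived IDs, can be sketched as follows. This is an illustration in Python rather than code from the prototype; the class name `ShortLivedIdPool` and its methods are hypothetical. It follows the convention described in item 2: 256 serial numbers held in eight 32-bit words, where a 1 bit means free and a 0 bit means in use.

```python
class ShortLivedIdPool:
    """Bitmap allocator for a fixed pool of consecutive serial numbers.

    Bit i of the bitmap is 1 when serial number i is free and 0 when it
    is in use, as suggested in the text. (Hypothetical sketch.)
    """

    WORD_BITS = 32

    def __init__(self, pool_size=256):
        if pool_size % self.WORD_BITS != 0:
            raise ValueError("pool_size must be a multiple of 32")
        self.pool_size = pool_size
        # Every ID starts out free, so every bit starts out set.
        self.words = [0xFFFFFFFF] * (pool_size // self.WORD_BITS)

    def allocate(self):
        """Return the lowest free serial number, or None if none is free."""
        for index, word in enumerate(self.words):
            if word:
                bit = (word & -word).bit_length() - 1  # lowest set bit
                self.words[index] = word & ~(1 << bit)  # mark as in use
                return index * self.WORD_BITS + bit
        return None

    def free(self, serial):
        """Mark a serial number as free again so it can be reused."""
        if not 0 <= serial < self.pool_size:
            raise ValueError("serial number outside the short-lived pool")
        index, bit = divmod(serial, self.WORD_BITS)
        self.words[index] |= 1 << bit


pool = ShortLivedIdPool()
first = pool.allocate()
second = pool.allocate()
pool.free(first)
reused = pool.allocate()  # the freed serial number is handed out again
```

When `allocate` returns `None`, a caller would presumably fall back to the regular ID generator, so an exhausted pool degrades to the current behavior rather than failing.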


The currently implemented mechanism only reuses context IDs (a fixed number of them). No 
attempt is currently made to reuse other objects’ IDs. 


9.4 The Shortage of Memory 

Of course, the scarcity of memory per node will also prove to be a problem. The goal 
is to take advantage of the large collective memory provided by the system (a 4096 node 
J-Machine with 4K memory per node would have 16 megabytes of primary memory). Load 
balancing can be used not only in choosing processors to perform work, but also in choosing 
nodes to allocate memory from. Simple gradient plane approaches [RF87] can be used 
to cool down memory “hot spots”. Garbage collection, expanded memory nodes, and the 
sweeping of “dusty” objects to offline storage are all possible solutions to the memory 
shortage problem. 

The current prototype operating system kernel takes two approaches to memory. 
If a message arrives to allocate an object, and there is not enough memory available, the 
message is forwarded to another node. However, if a process has been running for a while 
and the node runs out of memory, the calling message cannot simply be forwarded, since 
some work has already taken place. Instead, the process must have its state saved in a 
context, and room must be made on this node by evicting certain objects. Unfortunately, 
there might not be enough memory to allocate a context. A solution out of this trap is to 
require that there always be one minimal sized context object available for each priority 
level. A check could be made in the CALL and SEND handlers (and any other message 
handlers that could fall into these circumstances) for a free context. 

9.5 Queue Size 

Queue sizing also proves to be a problem in the system. Since we want to be able to migrate 
objects by message sends, an empty queue must always be big enough to hold any object. 
This means that the queue must be as big as the heap. This is far too costly in terms 
of memory in the 1K node prototype, and we have not attempted to make a fix. It would 
always be possible, though admittedly tedious, to send messages in “chunks” that would be 
able to fit in the queues. 


9.6 Suspension and Processor State 

Whenever a process suspends and plans on restarting later, it must be able to save its 
processor state. This normally means its register set, but we must not forget about two 
other forms of processor state, queues and stacks. When we suspend and there is a message 
we want to save in the queue, we copy it out into a heap object and set the message pointer 
to point to the object instead of the queue. Stacks are more difficult to save and 
restore, and we have decided to explicitly prohibit the saving of stack frames. So, the 
operating system is given the task of ensuring it will never have to suspend and restart 
with information on the stacks. This was a source of much personal misery during the 
implementation of the OS (though certainly less than there would have been without the 
existence of stacks). 


9.7 Summary 

This chapter has touched on just a few of the difficulties in the design of the Jellybean 
Operating System Software. Some are due to inadequacies in hardware or scale, some are 
due to lack of behavioral measurements, and some due to lack of insight. These will most 









Chapter 10 


Performance Evaluation 


Never promise more than you can perform. 
— “Publilius Syrus”, Maxim 528 

This chapter provides a quantitative performance evaluation of several important 
system services. Though the prototype implementation is certainly not optimal in any way, 
it should be a reasonable approximation of an actual working operating system kernel, and 
as such, the numbers presented in the chapter should be useful for the design and tuning 
of the rest of the Jellybean system. In addition, we should be able to see what parts of the 
system need fixing, before the machine is fabricated. 

10.1 The Virtual Binding Tables 

The virtual name manager is composed of five system routines nested in the hierarchy 
shown in figure 10.1. The BRAT itself is composed of a 128 word binding table of 64 2-word 
bindings. Bindings are entered by a linear probing [Sed83] scheme where a hash function 
determines the first choice for the location of the binding, and a linear search is performed 
from there. This linear search can take a significant amount of time (at least on the scale 
of average task size), so we need (1) an efficient algorithm and (2) a successful hashing 
scheme. The remainder of this section examines the execution time of each BRAT routine 
and presents some very preliminary hashing measurements. 

Figure 10.1: The Hierarchy of the Virtual Name Manager 

10.1.1 Instruction Counts 

The BRAT_PEEK system call is the core of all of the virtual name services. It takes a 
key to hash and a data word to match (not necessarily the same, since you might want to 
look for the first NIL slot where a certain key could be placed, as is done when adding new 
entries). The key is hashed, providing the index into the table, and a linear search with 
wraparound proceeds from there. The cost of this call is between 22 and 540 instructions, 
based on how far the search has to progress. A reasonable cost approximation, $C_{peek}$, for 
a search that finds the data in the $n$-th slot is $22 + 8 \times (n - 1)$ steps. 


The rest of the BRAT calls utilize this BRAT_PEEK routine. 

• BRAT_XLATE looks up a binding in the BRAT and takes $27 + C_{peek}$ steps to 
complete. 

• BRAT_PURGE searches the BRAT until it finds the first binding of the specified 
word, and removes it from the table. This takes $30 + C_{peek}$ steps to complete. 

• BRAT_ENTER_NEW adds a new entry to the BRAT without first removing any 
previous bindings. It accomplishes its task in $32 + C_{peek}$ steps. 

• The most expensive routine, potentially, is the BRAT_ENTER routine. This is 
like BRAT_ENTER_NEW, but it first removes a previous binding, requiring another 
BRAT search. This can take as much as $32 + 2 \times C_{peek}$ steps. 
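
The instruction counts above form a small cost model, written out here in Python as a sketch. The function names are illustrative only; the constants are the ones quoted in this section, and `n` is the number of slots probed, counting the hashed slot as 1.

```python
def c_peek(n):
    """Approximate cost of BRAT_PEEK when the match is the n-th slot probed."""
    if n < 1:
        raise ValueError("n counts probed slots and starts at 1")
    return 22 + 8 * (n - 1)


def brat_xlate(n):
    """Cost of looking up a binding."""
    return 27 + c_peek(n)


def brat_purge(n):
    """Cost of finding and removing the first matching binding."""
    return 30 + c_peek(n)


def brat_enter_new(n):
    """Cost of adding an entry without removing a previous binding."""
    return 32 + c_peek(n)


def brat_enter_max(n):
    """Upper bound for BRAT_ENTER, which performs two BRAT searches."""
    return 32 + 2 * c_peek(n)


# Tabulate the model for a hit in the hashed slot and for longer searches.
for n in (1, 10, 64):
    print(n, c_peek(n), brat_xlate(n), brat_purge(n),
          brat_enter_new(n), brat_enter_max(n))
```

Because every routine is dominated by the $8 \times (n - 1)$ term once searches get long, keeping `n` small matters more than trimming the fixed overheads.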


10.1.2 Effectiveness of Linear Probing 

Evidently, the crucial factor in the effectiveness of the BRAT routines is the cost of peeking 
through the BRAT, $C_{peek}$, which is a linear function of how far away from the expected hash 
spot the value resides. What the average distance in hash steps will be for a typical machine 
depends greatly on (1) the application that is being run, (2) how storage reclamation is 
handled, and (3) what is done when the BRAT overflows, all issues needing further 
study. Nonetheless, I would like to proceed with an informal, ad hoc analysis, based on 
reasonable estimates and educated guesswork. The rationale is to see if the linear probing 
strategy seems to generally work, meaning that the average number of steps 
until the entry is found is small¹. 

¹ It is not obvious that this will be so. In fact, it is quite easy to be concerned that this linear rehashing 
approach might actually work itself into a steady state where entries are always very far away from where 
they are supposed to be. 

The following data was generated by a simulation program called bratsim that takes 
an input pattern of references and simulates their effect on the BRAT. The size and 
maximum fullness of the BRAT are specifiable. The simulator takes each reference and looks it 
up in the BRAT. 

• If the reference is in the BRAT, it records the number of steps away from where it 
should be. 

• If the reference is not in the BRAT, it is entered as soon as possible after its hashed 
spot. 

• When names get entered, some may be arbitrarily deleted to maintain a maximum 
full percentage. 

• If the BRAT fills, a random slot will be emptied. 

The reference pattern generator is also based on initial approximations, generating patterns 
possibly likely in applications we envision running. It is currently configured with the 
following parameters: 10% new IDs, 20% context IDs, 35% recent IDs to simulate locality, 
20% less local IDs, and 15% very random IDs to simulate class/selector bindings, method 
IDs and other references following less of a pattern. I would expect this estimate to be 
conservative. 
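
The quantity being measured, the distance between a key's hashed slot and the slot where it is actually entered, can be illustrated with a much simpler experiment than bratsim. The Python sketch below is not bratsim: it uses uniformly random keys, an identity hash, and no deletions or locality, all of which are assumptions made for illustration. All names are hypothetical.

```python
import random

EMPTY = None


def identity_hash(key):
    """Placeholder hash; the prototype's real hash function is not shown here."""
    return key


def probe_insert(table, key, hash_fn=identity_hash):
    """Place key in the first empty slot at or after its hashed slot.

    Returns the number of steps from the hashed slot, or None if the
    table is full. The search wraps around the end of the table.
    """
    size = len(table)
    start = hash_fn(key) % size
    for steps in range(size):
        slot = (start + steps) % size
        if table[slot] is EMPTY:
            table[slot] = key
            return steps
    return None


def mean_enter_distance(rows, fullness, trials=200, seed=1):
    """Average distance of the last key entered when filling to `fullness`."""
    rng = random.Random(seed)
    target = min(rows, max(1, int(rows * fullness)))
    total = 0
    for _ in range(trials):
        table = [EMPTY] * rows
        steps = 0
        for key in rng.sample(range(1 << 16), target):
            steps = probe_insert(table, key)
        total += steps
    return total / trials


for percent in (30, 50, 70, 90):
    print(percent, mean_enter_distance(64, percent / 100))
```

Since this sketch omits the reference mix and the reclamation model described above, its numbers should not be expected to match the figures; it only shows how the enter distance is defined and measured.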

Based on these estimates, and the reclamation model presented above, we can chart 
how many steps away from the hashed slot particular IDs land when they are entered. For a 
64 row table, this is graphed in figure 10.2. We see an asymptotic function relating BRAT 
space used and the locality of entries to their intended slots. For the 64 row example, the 
system begins to be unmanageable after the BRAT becomes more than 60 - 70% full. 

Figure 10.3 shows the effect of doubling the BRAT size. The trend is still rapidly 
increasing, but the gains we get in terms of object storage may outweigh the extra steps 
involved in lookup. The flatness of the middle portion, from 40 - 60% hints at a desirable 
operating region. 

Figure 10.2: 64 Row BRAT Enter Distances from Hashed Slot (axis label: Maximum Percentage of BRAT Space Used) 

Figure 10.3: 128 Row BRAT Enter Distances from Hashed Slot (axis label: Maximum Percentage of BRAT Space Used) 

So, now I would like to suggest educated guesses at the answers to the following two 
questions. 

1. How full should we allow the BRAT to get? 

2. How large should the BRAT be? 

In the last few paragraphs, I indicated the severity of the BRAT filling problem. After 70% 
capacity, the BRAT’s performance becomes intolerable. For this reason, I suggest that 70% 
capacity should be an absolute maximum for BRAT fullness, and the normal operating fullness 
should not usually exceed 50%. I propose this as the answer for question 1. 

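Question number 2 can be answered by adapting the analysis presented in the last 
chapter. The new constraint equations become: 

\begin{align*}
\mathit{heapsize} + \mathit{totalbratsize} &= \mathit{freememory} \\
\mathit{residentobjects} &= \frac{\mathit{heapsize}}{10} \\
\mathit{migratedobjects} &= \mathit{residentobjects} \\
\mathit{bratspaceused} &= 2\,(\mathit{residentobjects} + \mathit{migratedobjects}) \\
\mathit{bratspaceused} &= 0.7 \times \mathit{totalbratsize} \\
\Rightarrow \mathit{totalbratsize} &= \tfrac{4}{11} \times \mathit{freememory} \\
\Rightarrow \mathit{heapsize} &= \tfrac{7}{11} \times \mathit{freememory}
\end{align*}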

With 600 words of free space, this reserves 218 words for the BRAT and 382 words for the 
heap. This will hopefully be a more accurate value, though it is not a power of 2, which 
will complicate the hashing slightly. 
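
The constraint equations can be solved for any object size and fullness limit. The Python sketch below does this with exact fractions. The function name and parameters are illustrative, and the defaults (10-word objects, 2-word bindings, one migrated binding per resident object) are the assumptions used in the text, not measured values.

```python
from fractions import Fraction


def partition_memory(free_words, object_words=10, binding_words=2,
                     migrated_per_resident=1, max_fullness=Fraction(1)):
    """Split free memory between heap and BRAT under the sizing model.

    Returns (heap_words, brat_words) as exact fractions; rounding to whole
    words is left to the caller.
    """
    bindings_per_object = 1 + Fraction(migrated_per_resident)
    # BRAT words needed per heap word, inflated so that the table never
    # exceeds max_fullness.
    brat_per_heap = (Fraction(binding_words) * bindings_per_object
                     / (object_words * Fraction(max_fullness)))
    heap = Fraction(free_words) / (1 + brat_per_heap)
    brat = free_words - heap
    return heap, brat


# Sizing with a completely full table, then with a 70% fullness limit.
heap_full, brat_full = partition_memory(600)
heap_70, brat_70 = partition_memory(600, max_fullness=Fraction(7, 10))
print(float(heap_full), float(brat_full))
print(float(heap_70), float(brat_70))
```

Exact fractions are used so that the 5/7 and 7/11 ratios are visible in the result. A real configuration would also round the BRAT size, presumably to a power of 2 to keep the hash mask simple.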

The efficient manipulation of the BRAT is crucial to the success of the Jellybean 
system. Future study is needed to evaluate hashing functions, and perhaps a form of linear 
re-hashing is desired, where the first hash is followed by a subsequent number of other 
hashes instead of a linear search. In addition, once real applications are run, we can get a 
better idea how the system will behave. Likewise, the translation buffer performance needs 
analysis, as this will indicate how often BRAT lookup occurs. 

10.2 Object Allocation 

A common task of the Jellybean Operating System Software is to allocate objects from the 
heap. This section will examine how costly this operation can be. 

Figure 10.4 describes the nesting of services required to perform the NEW system 
call. The ALLOC routine takes 24 instructions, it takes 19 instructions to generate a new 
ID, and it takes $32 + C_{peek}$ instructions to enter a new ID into the BRAT. With 20 instructions 
for inter-module glue, the NEW system call takes $95 + C_{peek}$ instructions. According to 
the BRAT analysis results, if we operate at less than 70% full, we will have to take fewer 
than 10 steps to enter a new ID. This would indicate that $C_{peek} = 94$ steps and therefore 
NEW should take 95 + 94 = 189 instructions. At best, with 0 steps to search, the NEW 
call would take 117 steps. 

10.3 Context Allocation 

Another commonly executed routine is the NEW_CONTEXT system call. As described in 
chapter 5, this service was expected to be expensive enough to merit special treatment. The 
context free list was developed to provide a pool of pre-allocated contexts for fast context 
allocation. The flowchart in figure 10.5 shows the steps taken by the routine. Note that if the 
requested context is of an abnormal size, or if there are no pre-allocated contexts on the 
free list, the NEW routine is called to allocate a new object. Requesting an abnormally 
sized context takes $25 + C_{new}$ instructions, allocating a context when none are on the free 
list takes $27 + C_{new}$ instructions, but allocating a context off the free list takes only 20. If 
we can keep contexts in the pool, we will do well. 

Freeing contexts is also fast, taking only 25 instructions. This is only about 10% 
of the time it used to take to perform this operation, when we were required to purge the 
old context ID, generate a new one, and place the new ID in the context and BRAT. By 
preventing late replies to contexts, we have prevented this performance loss. 

Figure 10.4: Nesting of Services for the NEW System Call 

Figure 10.5: Flowchart for the NEW_CONTEXT System Call 
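
The decision structure of the context allocator can be sketched in Python as below. This is an illustration only, not the ROM code: the class and method names are hypothetical, contexts are stand-in dictionaries, and each call returns the instruction count quoted in this section so that the three paths can be compared. The default `c_new=117` is the best-case cost of NEW from section 10.2, the pool of 3 contexts matches the boot code described in section 10.4, and the normal size of 13 is taken from CONT_NORMAL_SIZE in the OS.MOP listing. Dropping abnormal contexts on free is an assumption.

```python
class ContextPool:
    """Free list of pre-allocated, normal-sized contexts (hypothetical sketch)."""

    def __init__(self, preallocated=3, normal_size=13, c_new=117):
        self.normal_size = normal_size
        self.c_new = c_new  # assumed cost of the general NEW call
        self.free_list = [self._make(normal_size) for _ in range(preallocated)]

    @staticmethod
    def _make(size):
        # Stand-in for a heap object; a real context holds processor state.
        return {"size": size}

    def new_context(self, size=None):
        """Return (context, instruction_cost) for an allocation request."""
        size = self.normal_size if size is None else size
        if size != self.normal_size:
            return self._make(size), 25 + self.c_new  # abnormal size
        if self.free_list:
            return self.free_list.pop(), 20           # fast path
        return self._make(size), 27 + self.c_new      # free list empty

    def free_context(self, context):
        """Return a context to the pool; returns the instruction cost."""
        if context["size"] == self.normal_size:
            self.free_list.append(context)
        return 25


pool = ContextPool(preallocated=1)
ctx, fast_cost = pool.new_context()       # served from the free list
_, slow_cost = pool.new_context()         # free list now empty
_, odd_cost = pool.new_context(size=20)   # abnormal size
free_cost = pool.free_context(ctx)
_, again_cost = pool.new_context()        # reuses the freed context
```

The gap between the fast path and the two fallback paths is what makes it worthwhile to keep the free list from running dry.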

10.4 Boot Code and Message Handlers 

Let’s conclude the chapter with a brief discussion of the complexity of the Bootstrap code 
and several message handlers. The boot code is run when each processor is powered up, 
and places the processor in a runnable state. All together, it takes 5005 steps to boot the 
processor. This is made up of 4103 steps to erase the memory, 481 steps to initialize the 
context free list with 3 contexts, 247 steps to fill the exception vector table, 86 steps to fill 
the extended call table and 72 steps to set up the stacks, queues and other values. 

The WRITE message handler takes $8 + 7 \times l + 3$ steps to send $l$ words of data. The 
READ message handler takes 8 steps to read an empty message, or $7 + 5 \times (l - 1)$ steps to 
read a block of data of length $l$. 

The CALL message handler can exhibit several possible times. If the method being 
CALLed is local and its translation is in the cache, it only takes 6 instructions to start it 
executing. If the method is local, but not in the cache, it takes $64 + C_{peek}$ steps, because 
the XLATE exception handler takes $58 + C_{peek}$ steps to complete. If the method is not 
local, message sends are involved, making it more difficult to analyze. 

10.5 ROM Size 

Out of the 1024 words reserved for ROM, the operating system prototype uses 760. 

10.6 Summary 


This chapter presented a brief performance evaluation of several important parts of the 
Jellybean system. In addition to analyzing the cost of routines, several more fundamental 
issues were noticed. These are itemized below. 

• The BRAT needs to be searched efficiently. The linear probing method used can take 
a significantly long time if values get placed far from their intended position. 

• Based on preliminary simulation, the performance becomes unacceptable when the 
BRAT is filled to 60 to 70 percent. We can choose a maximum fullness, and derive 
the BRAT and heap sizes based on the fullness value and the expected size of objects. 

• We note that even with an insightful configuration of the BRAT, a translation cache 
is required. The configuration of the cache is left to further study. 

• Creating a new object is more expensive than we would like (a minimum of 117 
instructions). This could be optimized with clever coding, but not much more performance 
could be gained in this manner. The problem is more fundamental, resting on the 
performance of the cache and the BRAT lookup. 

• The caching of free contexts seems to work well. Creating a new context requires 
only 20 instructions if there is a context on the free list (and assuming we don’t get 
a translation fault). This is compared to a minimum of 144 instructions without a 
context on the free list. Freeing a context is also fast, only 25 instructions. 

• Calling a method takes only 6 instructions if the method is local and its 
translation is in the cache. If it is not in the cache, performance again suffers, requiring a 
minimum of 86 instructions. 


Table 10.1 summarizes some of the more important performance statistics presented in this 
chapter. 

Routine           Instruction Count            Notes 
----------------  ---------------------------  ------------------------- 
BRAT_PEEK         C_peek = 22 + 8 x (n - 1)    n = slots to search 
BRAT_XLATE        27 + C_peek 
BRAT_PURGE        30 + C_peek 
BRAT_ENTER_NEW    32 + C_peek 
BRAT_ENTER        32 + 2 x C_peek              maximum 
ALLOC             24 
GENID             19 
NEW               95 + C_peek 
NEW_CONTEXT       20                           with context on free list 
                  27 + C_new                   no context on free list 
FREE_CONTEXT      25 
CALL_MSG          6                            with method ID in cache 
                  64 + C_peek                  method ID not in cache 

Table 10.1: Timings for Common System Services 


Chapter 11 


Conclusions 


All’s well that ends well 

— Shakespeare, in All’s Well That Ends Well IV 

There is a time for many words, 
and there is also a time for sleep. 

— Homer, in The Iliad, XI 


11.1 Summary 

The Jellybean Operating System Software is a prototype operating system kernel for the 
Jellybean Machine. Its duties include object-based storage allocation, virtual distributed 
naming, object migration, process definition and control, local and remote process 
execution, and the support of an object-oriented calling model. 

This thesis described the JOSS in some detail, including its successes and weaknesses. The 
report also discusses issues in the future Jellybean operating system that were not 
implemented in the prototype because of lack of support, study and time. These include storage 
reclamation, resource distribution bureaucracies, and distributed objects. These will most 
likely become important parts of the Jellybean operating environment in the future. 

Several deficiencies may exist in the current system. Performance-wise, searching 
the translation table may well be too slow. Several solutions can be proposed, including (1) 
increasing the size of the BRAT and decreasing the fullness, (2) experimenting with various 
hashing functions and (3) providing an effective translation buffer. Memory shortages may 
prove a significant problem, and this will place an extra burden on reclamation attempts, 
which are already made difficult because of the problem of travelling references. 

On the other hand, if the cache works well, and if the BRAT is not very full, the 
whole system seems to perform admirably. Method invocations are powerful but fast. The 
context free list allows rapid creation and reuse of contexts. The global naming system and 
migration provide a high degree of flexibility. 

11.2 Suggestions for Further Study 

This thesis scratched the surface of many interesting research issues, many of which I for 
one would be eager to investigate. 

In the area of performance evaluation, the configuration and simulation of the 
translation buffer and BRAT in a real life environment is important to the success of the Jellybean 
Machine. Also of practical as well as theoretical interest would be the study and evaluation 
of distribution hierarchies and the various manifestations of how to handle virtual hints. 

Reclamation is an important potential area of research. An efficient mechanism to 
collect garbage over a distributed network would be of general interest as well, especially if 
some incremental form of collection can be developed. Policies for handling out of memory 
conditions on processing nodes are also attractive, involving selective migration of objects. 

Finally, load and resource balancing policies need to be investigated, especially since 
each processor can quickly become overwhelmed (being limited in power and memory 
capacity). Simple gradient plane approaches might be attempted where load spreads to where 
it is lower. Network analysis will also be an important factor. 


11.3 Hopes 


The Jellybean Machine has the potential of being an important step in the development of 
multicomputer networks. It is my hope that further study will be encouraged so that the 
difficulties of machines of this genre can be resolved (memory shortages, expensive name 
translation, no caching of mutable objects, need for resource balancing, etc.) and they can 
show their benefits as scalable, programmable processors. 


; OS.MOP 
; 
; This file contains operating system labels & stuff 

; Useful system values 

LABEL   SYS_LEN_BITS              = 10 
LABEL   SYS_LEN_MASK 
LABEL   SYS_ID_NODE_BITS          = 16 
LABEL   SYS_ID_ID_BITS            = 16 
LABEL   SYS_ID_ID_MASK            = *111111111111 
LABEL   SYS_ID_NODE_MASK          = *111111111111 
LABEL   SYS_CLASS_MASK            = *111111111111 
LABEL   SYS_CLASS_BITS            = 16 
LABEL   SYS_SELECTOR_MASK         = *111111111111 
LABEL   SYS_SELECTOR_BITS         = 16 
LABEL   SYS_OP0_BITS              = 7 
LABEL   SYS_OP1_BITS              = 2 
LABEL   SYS_OP2_BITS              = 2 
LABEL   SYS_OP0_MASK              = *1111111 
LABEL   SYS_UNCHECKED             = (1<<31) 
LABEL   SYS_UNC                   = SYS_UNCHECKED 
LABEL   SYS_AOSHADOW              = (1<<8) 
LABEL   SYS_ABS                   = SYS_AOSHADOW 
LABEL   SYS_INVADR                = (1<<30) 
LABEL   SYS_MARK_MASK             = (1<<31) 
LABEL   SYS_COPY_MASK             = (1<<30) 
LABEL   SYS_REL_MASK              = (1<<31) 
LABEL   SYS_UNMOVABLE_MASK        = (1<<29) 

; XLATE Modes 

LABEL   XLATE_OBJ 
LABEL   XLATE_ID_TO_NODE          = 1 
LABEL   XLATE_METHOD              = 2 
LABEL   XLATE_LOCAL               = 3 

; Temporary locations 

LABEL   TEMP0                     = 0 
LABEL   TEMP1                     = 1 
LABEL   TEMP2                     = 2 
LABEL   TEMP3                     = 3 
LABEL   TEMP4                     = 4 
LABEL   TEMP5                     = 5 
LABEL   TEMP6                     = 6 


; Memory Map 

LABEL   OS_P0_TEMPS_BASE          = 0 
LABEL   OS_P0_TEMPS_LENGTH        = 8 
LABEL   OS_P1_TEMPS_BASE          = 8 
LABEL   OS_P1_TEMPS_LENGTH        = 8 
LABEL   OS_EVECTORS_BASE          = 16 
LABEL   OS_EVECTORS_LENGTH        = 46 
LABEL   OS_P0_STACK_BASE          = 64 
LABEL   OS_P0_STACK_LENGTH        = 32 
LABEL   OS_P1_STACK_BASE          = 96 
LABEL   OS_P1_STACK_LENGTH        = 32 
LABEL   OS_QUEUE0_BASE            = 128 
LABEL   OS_QUEUE0_MASK            = 127 
LABEL   OS_CACHE_BASE             = 256 
LABEL   OS_CACHE_MASK             = 63 
LABEL   OS_QUEUE1_BASE            = 320 
LABEL   OS_QUEUE1_MASK            = 31 
LABEL   OS_VARS_BASE              = 352 
LABEL   OS_VARS_LENGTH            = 16 
LABEL   OS_MCACHE_BASE            = 376 
LABEL   OS_MCACHE_LENGTH          = 32 
LABEL   OS_XVECTORS_BASE          = 408 
LABEL   OS_XVECTORS_LENGTH        = 16 
LABEL   OS_LOCKED_BASE            = 424 
LABEL   OS_LOCKED_LENGTH          = 0 
LABEL   OS_INITIAL_BRAT_LENGTH    = 128 
LABEL   OS_INITIAL_BRAT_MASK      = (OS_INITIAL_BRAT_LENGTH-1)&(-1) 

; Locations of OS Variables 

LABEL   VAR_FREETOP               = OS_VARS_BASE + 0 
LABEL   VAR_BRAT_BASE             = OS_VARS_BASE + 1 
LABEL   VAR_BRAT_LENGTH           = OS_VARS_BASE + 2 
LABEL   VAR_BRAT_HASH_MASK        = OS_VARS_BASE + 3 
LABEL   VAR_ROM_START             = OS_VARS_BASE + 4 
LABEL   VAR_NEXT_ID               = OS_VARS_BASE + 5 
LABEL   VAR_LAST_ID               = OS_VARS_BASE + 6 
LABEL   VAR_MCACHE_BASE           = OS_VARS_BASE + 7 
LABEL   VAR_MCACHE_LENGTH         = OS_VARS_BASE + 8 
LABEL   VAR_MCACHE_OVERFLOW_LIST  = OS_VARS_BASE + 9 
LABEL   VAR_CFREE_LIST            = OS_VARS_BASE + 10 
LABEL   VAR_HEAP_BASE             = OS_VARS_BASE + 11 
LABEL   VAR_NET_WIDTH             = OS_VARS_BASE + 12 
LABEL   VAR_NET_HEIGHT            = OS_VARS_BASE + 13 

; Tag Values 

LABEL   TAG_SYM                   = 0 
LABEL   TAG_INT                   = 1 
LABEL   TAG_BOOL                  = 2 
LABEL   TAG_ADDR                  = 3 
LABEL   TAG_IP                    = 4 
LABEL   TAG_MSG                   = 5 
LABEL   TAG_A                     = 6 
LABEL   TAG_B                     = 7 
LABEL   TAG_C                     = 8 
LABEL   TAG_D                     = 9 
LABEL   TAG_E                     = 10 
LABEL   TAG_F                     = 11 
LABEL   TAG_CS                    = TAG_D 
LABEL   TAG_OBJHEAD               = TAG_E 
LABEL   TAG_OBJID                 = TAG_F 
LABEL   TAG_INST0                 = 12 
LABEL   TAG_INST1                 = 13 
LABEL   TAG_INST2                 = 14 
LABEL   TAG_INST3                 = 15 

; Exception Vector Locations 

LABEL   EVECTORBASE               = OS_EVECTORS_BASE 

LABEL   FAULT_BKGD                = EVECTORBASE 
LABEL   FAULT_DBLFAULT            = EVECTORBASE + 1 
LABEL   FAULT_ILGINST             = EVECTORBASE + 2 
LABEL   FAULT_ILGADRMD            = EVECTORBASE + 3 
LABEL   FAULT_ACCESS              = EVECTORBASE + 4 
LABEL   FAULT_EARLY               = EVECTORBASE + 5 
LABEL   FAULT_LIMIT               = EVECTORBASE + 6 
LABEL   FAULT_INVADR              = EVECTORBASE + 7 
LABEL   FAULT_MSG                 = EVECTORBASE + 8 
LABEL   FAULT_QUEUE               = EVECTORBASE + 9 
LABEL   FAULT_SEND                = EVECTORBASE + 10 
LABEL   FAULT_XLATE               = EVECTORBASE + 11 
LABEL   FAULT_RANGE               = EVECTORBASE + 12 
LABEL   FAULT_PUSH                = EVECTORBASE + 13 
LABEL   FAULT_POP                 = EVECTORBASE + 14 
LABEL   FAULT_OVERFLOW            = EVECTORBASE + 16 
LABEL   FAULT_TYPE                = EVECTORBASE + 17 
LABEL   FAULT_IA                  = EVECTORBASE + 18 
LABEL   FAULT_IB                  = EVECTORBASE + 19 
LABEL   FAULT_IC                  = EVECTORBASE + 20 
LABEL   FAULT_ID                  = EVECTORBASE + 21 
LABEL   FAULT_IE                  = EVECTORBASE + 22 
LABEL   FAULT_IF                  = EVECTORBASE + 23 


Classes 


LABEL   CLASS_CONTEXT   = 1
LABEL   CLASS_METHOD    = 2
LABEL   CLASS_MESSAGE   = 3
LABEL   CLASS_INT       = 512


System Call Values 



LABEL   TRAP_NEW_CONTEXT                = 0
LABEL   TRAP_FREE_CONTEXT               = 1
LABEL   TRAP_XFER_ID                    = 2
LABEL   TRAP_XFER_ADDR                  = 3
LABEL   TRAP_ID_TO_NODE                 = 4
LABEL   TRAP_NEW                        = 5
LABEL   TRAP_MALLOC                     = 6
LABEL   TRAP_GENID                      = 7
LABEL   TRAP_VERSION                    = 8
LABEL   TRAP_BRAT_PEEK                  = 9
LABEL   TRAP_SWEEP                      = 10
LABEL   TRAP_FREE_SPECIFIED_CONTEXT     = 11
LABEL   TRAP_XCALL                      =
LABEL   TRAP_DIE                        = 15






Extended Call Values 



LABEL   XCALL_BRAT_ENTER        = 1
LABEL   XCALL_BRAT_XLATE        = 2
LABEL   XCALL_BRAT_PURGE        = 3
LABEL   XCALL_MIGRATE_OBJECT    = 4
LABEL   XCALL_BRAT_ENTER_NEW    = 5


Object Field Offsets 



LABEL   OBJECT_HDR              =
LABEL   OBJECT_ID               = 1
LABEL   CONT_PSTATE_OFFSET      = 2
LABEL   CONT_NEXT_CONTEXT       = 3
LABEL   CONT_RESOURCE           = 4
LABEL   CONT_NORMAL_SIZE        = 13
LABEL   PSTATE_ID0              =
LABEL   PSTATE_ID1              =
LABEL   PSTATE_ID2              =
LABEL   PSTATE_ID3              =
LABEL   PSTATE_R0               =
LABEL   PSTATE_R1               =
LABEL   PSTATE_R2               =
LABEL   PSTATE_R3               =
LABEL   PSTATE_IP               = 8
LABEL   CONT_PSTATE_SIZE        = 9


Handler IDs 



LABEL   HANDLER_INSTALL_METHOD  =
LABEL   HANDLER_LOOKUP_METHOD   = TAG_OBJID:1
Designed and Implemented by the members of the Concurrent
VLSI Architecture Group at the Massachusetts Institute of
Technology.

Copyright (C) 1986, 1987 Massachusetts Institute of Technology 
ALL RIGHTS RESERVED 

No copy of this source code may be made by any means, 
electronic or otherwise, without prior permission of 
the Massachusetts Institute of Technology. 


; ROM.MDP

; This file contains system kernel routines for the MDP ROM

; Edit History (started 6/23/87) 

; Who Date What 


Brl 6/23/87 


Brl 6/24/87 

Brl 6/26/87 


Brl 6/29/87 
Brl 6/30/87 
Brl 7/06/87
Brl 7/09/87 

Brl 7/10/87 
Brl 7/13/87 
Brl 7/17/87 


Brl 7/28/87 
Brl 8/05/87
Brl 8/10/87 


Brl 8/11/87 


Brl 8/12/87 

Brl 2/05/88

Brl 2/10/88 

Brl 2/16/88 

Brl 2/19/88 

Brl 2/22/88 

Brl 3/04/88

Brl 3/08/88 

Brl 3/16/88 

Brl 3/18/88 


Added STAT_x labels. Added ROM_SIZE
calculations. Changed temporary use to
avoid bashing in conjunction with dependency
graph, and larger temporary space. Fault
handlers now use FTEMPs instead of TEMPs.

New trashing specification to make variable
use clearer.


More work on method mode of XLATE_EXC.

Stack testing code & boot initialization.
Started converting trap routines from TEMP
to stack conventions.

Continued converting to stack conventions

Removed stack conventions

Inserted stack conventions

Started conversions to V8, including the

new register instructions

Continued conversions

Put some initial garbage collection attempts in.
Put in BRAT manipulation traps. We need more
trap vectors for system calls. So, add a
system call location to use another table
sometime.

Switched to version 9.


Upgraded XLATE_EXC

Finished code for XLATE_EXC & method caching,
but haven't tested it yet. Fixed some bugs
in the BRAT manipulators.

Tested XLATE_EXC & method caching code.

There is a bug after the METHOD_REQUEST_REPLY
that causes a MSG fault. I think that the
METHOD_REQUEST_REPLY message has a length that
is maybe 1 too small, so when the
RESTART_CONTEXT message arrives, the last
word of the previous message is used as the
message header!!! Also updated os.mdp file.
Fixed the method caching length-of-message
problem. Made XFER restore data registers
and ID registers, and not try to reXLATE
A0 if its ID register is nil.

Modified context format to move processor
state to the end. Updated OS.MDP.

Added FREE_CONTEXT_TRP & FREE_CONTEXT_MSG.

Fixed OS.MDP that has OS vars in wrong place.
Added NEW_METHOD_MSG, ID_TO_NODE_TRP,
placed local XLATE in XLATE_EXC (for ID_TO_NODE
and other simple uses of XLATE), wrote
SEND_MSG.


Made XFER free contexts. Fixed up SWEEP_TRP.
Finished & tested heap compactor.

Changed ID_TO_NODE_TRP. Removed XLATE_RCVR
mode - replaced with XLATE_LOCAL and in-line
code within SEND_MSG. Added XLATE_ID_TO_NODE
mode to XLATE.


Added a locked-down region to memory map. Made
LOCKHEAP equivalent to PUSH I,MOVE TRUE,I and
UNLOCKHEAP to POP I.


Added method cache overflow list support.
Added extended system call mechanism.

Added "copy" bit to method headers. Cached
methods are now distinguished by this copy
bit rather than using the method directory
also for this purpose. Started INVADR_EXC
handler.



















; BOOT -- This routine contains the cold boot MDP code

        ; Find how much RAM we have
        MOVE                            ; This is a hack to fill
                                        ; R0 with the amount of RAM
        ; Clear memory
        MOVE                            ; Copy amount of RAM to R1
        MOVE                            ; Also copy to R2
_BOOT_CLR:
        BZ      R1,^_BOOT_CLRDONE       ; If loop done, break out
        SUB     R1,1,R1                 ; Decrement R1
        MOVE    R0,[R1,A0]              ; Stick NIL in address
        BR      ^_BOOT_CLR              ; Loop
_BOOT_CLRDONE:

        ; Save the RAM size in the OS variable, now that RAM is clear
        DC      VAR_ROM_START           ; R0 <- Offset to ROM_START var
        MOVE    R2,[R0,A0]              ; VAR_ROM_START <- 1st ROM loc

        ; Set up exception vectors & xcall vectors
_BOOT_EXCV:
        DC      ADDR:(EXC_VECTORS<<SYS_LEN_BITS)|OS_EVECTORS_LENGTH
        MOVE    R0,A1
        DC      ADDR:(OS_EVECTORS_BASE<<SYS_LEN_BITS)|OS_EVECTORS_LENGTH
        MOVE    R0,A2
        DC      OS_EVECTORS_LENGTH
_BOOT_EXCV_LOOP:
        BZ      R0,^_BOOT_XCALLV
        SUB     R0,1,R0
        MOVE    [R0,A1],R1
        MOVE    R1,[R0,A2]
        BR      ^_BOOT_EXCV_LOOP

_BOOT_XCALLV:
        DC      ADDR:(XCALL_VECTORS<<SYS_LEN_BITS)|OS_XVECTORS_LENGTH
        MOVE    R0,A1
        DC      ADDR:(OS_XVECTORS_BASE<<SYS_LEN_BITS)|OS_XVECTORS_LENGTH
        MOVE    R0,A2
        DC      OS_XVECTORS_LENGTH
_BOOT_XCALLV_LOOP:
        BZ      R0,^_BOOTSTACKS
        SUB     R0,1,R0
        MOVE    [R0,A1],R1
        MOVE    R1,[R0,A2]
        BR      ^_BOOT_XCALLV_LOOP

        ; Set up stacks

_BOOTSTACKS:
        DC                              ; R0 <- 0
        WRITER  R0,SP
        WRITER  R0,SP'


; Invalidate Queue registers 


WRITER ^“*^ YS - INVA0R| (OS_QUeUE0_SAS£«SYS_LEN_BITS) | OS.QUEUEO.MASK 
WTER ^[ OS - QUEUEO - B «E«SYS_L£N_8ITS) 

Sriter J2 i 2i2T 5 - 1NVW,R|<OS - QUeU61 - MSC « SYS -LEN.8ITS)|OS_QUEUE1 MASK 

Snm 


        ; Set up XLATE cache

        DC      ADDR:(OS_CACHE_BASE<<SYS_LEN_BITS)|OS_CACHE_MASK
        WRITER  R0,TBM

        ; Initialize OS variables

        DC      OS_LOCKED_BASE+OS_LOCKED_LENGTH ; R0 <- Initial heap base
        MOVE    R0,R2                   ; Copy to R2
        DC      VAR_HEAP_BASE           ; R0 <- Offset to HEAP_BASE var
        MOVE    R2,[R0,A0]              ; Store in VAR_HEAP_BASE
        DC      VAR_FREETOP             ; R0 <- Offset to FREETOP var
        MOVE    R2,[R0,A0]              ; Store in VAR_FREETOP



        DC      VAR_ROM_START           ; R0 <- Offset to ROM_START var
        MOVE    [R0,A0],R1              ; R1 <- First ROM location
        DC      OS_INITIAL_BRAT_LENGTH  ; R0 <- Initial size of BRAT
        MOVE    R0,R2                   ; Copy length to R2
        SUB     R1,R0,R1                ; R1 <- Base of BRAT
        DC      VAR_BRAT_BASE           ; R0 <- Offset to BRAT_BASE var
        MOVE    R1,[R0,A0]              ; Store in VAR_BRAT_BASE
        DC      VAR_BRAT_LENGTH         ; R0 <- Offset to BRAT_LEN var
        MOVE    R2,[R0,A0]              ; Store len in VAR_BRAT_LENGTH
        DC      OS_INITIAL_BRAT_MASK    ; R0 <- Initial BRAT hash mask
        MOVE    R0,R2                   ; Move to R2
        DC      VAR_BRAT_HASH_MASK      ; R0 <- Offset to hash mask
        MOVE    R2,[R0,A0]              ; Put initial hash mask in var
        DC      VAR_NEXT_ID             ; R0 <- Offset to NEXT_ID var
        MOVE    R0,R2                   ; Copy to R2 for safe keeping
        MOVE    0,R0                    ; R0 <- 0
        MOVE    R0,[R2,A0]              ; VAR_NEXT_ID <- 0
        DC      VAR_LAST_ID             ; R0 <- Offset to LAST_ID var
        MOVE    R0,R2                   ; Copy to R2 for safe keeping
        DC      SYS_ID_ID_MASK          ; R0 <- ID field mask
                                        ;       (same as last ID)
        MOVE    R0,[R2,A0]              ; Put last ID in VAR_LAST_ID
        DC      VAR_MCACHE_BASE         ; R0 <- Offset to mcache var
        MOVE    R0,R1                   ; Swap to R1
        DC      OS_MCACHE_BASE          ; R0 <- Initial base value
        MOVE    R0,[R1,A0]              ; Set MCACHE_BASE variable
        DC      VAR_MCACHE_LENGTH       ; R0 <- Offset to mcache length
        MOVE    R0,R1                   ; Swap to R1
        DC      OS_MCACHE_LENGTH        ; R0 <- Initial length value
        MOVE    R0,[R1,A0]              ; Set MCACHE_LENGTH variable
        DC      VAR_MCACHE_OVERFLOW_LIST ; R0 <- Addr of oflow list
        MOVE    NIL,R1                  ; R1 <- NIL
        MOVE    R1,[R0,A0]              ; Set oflow list to NIL


        ; Fill Context free list with a few contexts

_BOOT_CFREE_INIT:
        MOVE    3,R3                    ; R3 <- Number of ctxts to make
        MOVE    NIL,R0                  ; R0 <- NIL
        PUSH    R0                      ; Push NIL on the stack
_BOOT_CFREE_INIT_LOOP:
        DC      CONT_NORMAL_SIZE        ; R0 <- Size of normal context
        CALL    TRAP_NEW_CONTEXT        ; A1 <- New context address
        MOVE    [OBJECT_ID,A1],R1       ; R1 <- Context ID
        POP     R2                      ; R2 <- Old cfree list
        PUSH    R1                      ; Push new cfree list
        MOVE    R2,[CONT_NEXT_CONTEXT,A1] ; Next context = Old cfree list
        SUB     R3,1,R3                 ; Decrement ctxts left to make
        BNZ     R3,^_BOOT_CFREE_INIT_LOOP ; Loop
        DC      VAR_CFREE_LIST          ; R0 <- Offset to cfree list
        POP     R1                      ; R1 <- Cfree list
        MOVE    R1,[R0,A0]              ; Set up Cfree list variable
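
The free-list loop above can be restated in a few lines of Python. This is only an illustrative sketch: the dictionary-based contexts, the `make_allocator` helper, and the use of `None` for NIL are assumptions made for the example and are not part of the MDP kernel.

```python
NIL = None  # stand-in for the MDP NIL value (assumption)


def build_context_free_list(count, new_context):
    """Chain `count` fresh contexts into a singly linked free list.

    `new_context` is a hypothetical allocator that returns a dict with
    an "id" and a "next" slot. Returns (head_id, contexts_by_id).
    """
    contexts = {}
    head = NIL                      # PUSH NIL: the list starts out empty
    for _ in range(count):
        ctx = new_context()         # CALL TRAP_NEW_CONTEXT
        ctx["next"] = head          # next-context slot <- old free list
        head = ctx["id"]            # the new context becomes the list head
        contexts[ctx["id"]] = ctx
    return head, contexts


def make_allocator():
    """Hypothetical allocator handing out contexts with IDs 1, 2, 3, ..."""
    next_id = 0

    def allocate():
        nonlocal next_id
        next_id += 1
        return {"id": next_id, "next": NIL}

    return allocate


head, contexts = build_context_free_list(3, make_allocator())
```

Each new context is pushed on the front of the list, so the last context created ends up as the head stored in the free-list variable.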


        ; Enable message reception by masking off disable bits

_BOOT_ENABLE_QUEUES:
        DC      -SYS_INVADR             ; R0 <- All bits BUT the
                                        ; invalid address bit
        READR   QBM,R1
        AND     R1,R0,R1                ; Mask off disable bit
        WRITER  R1,QBM
        READR   QBM',R1
        AND     R1,R0,R1                ; Mask off disable bit
        WRITER  R1,QBM'
        MOVE    FALSE,R0
        WRITER  R0,I
        BR      ^BKGD_EXC

_BOOT_END:


BACKGROUND LOOPS 


DIE_TRP:
        BR      ^DIE_TRP
EMPTY_FAULT:
        BR      ^EMPTY_FAULT
EMPTY_TRAP:
        BR      ^EMPTY_TRAP
EMPTY_XCALL:
        BR      ^EMPTY_XCALL
PUSH_EXC:
        BR      ^PUSH_EXC
POP_EXC:
PRIMITIVE MESSAGES


; WRITE_MSG -- Message routine to write a block of data to consecutive
; locations.

; WRITE (destination-address) (data)*

WRITE_MSG:
        MOVE    [1,A3],R0               ; R0 <- Destination address
        MOVE    R0,A2                   ; Move to A2
        DC      SYS_LEN_MASK            ; R0 <- Mask to keep len bits
        MOVE    [0,A3],R2               ; R2 <- message header
        WTAG    R2,TAG_INT,R2           ; Cast header into an INT
        AND     R0,R2,R2                ; R2 <- message length
        MOVE    2,R0                    ; R0 <- Src offset into queue
        MOVE    0,R1                    ; R1 <- Dest offset into A2
_WRITE_MSG1:
        GE      R0,R2,R3                ; Are we at the end of message?
        BT      R3,^_WRITE_MSG_EXIT     ; If so, exit
        MOVE    [R0,A3],R3              ; Get a "hunk o' data"
        MOVE    R3,[R1,A2]              ; Toss it into the destination
        ADD     R0,1,R0
        ADD     R1,1,R1
        BR      ^_WRITE_MSG1
_WRITE_MSG_EXIT:
        SUSPEND
WRITE_MSG_END:
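
As a rough illustration of what WRITE_MSG does, the copy loop can be sketched in Python. The 10-bit `LEN_MASK` and the flat list used as memory are assumptions for the example; the real routine addresses memory through the A2 and A3 registers.

```python
LEN_MASK = 0x3FF  # assumed width of the header length field


def write_msg(message, memory):
    """Copy the data words of a WRITE message into `memory`.

    message[0] is the header (low bits hold the message length in words),
    message[1] is the destination index, message[2:] are the data words.
    """
    length = message[0] & LEN_MASK   # AND with the length mask
    dest = message[1]
    src, offset = 2, 0               # first data word, first destination slot
    while src < length:              # stop once src reaches the message length
        memory[dest + offset] = message[src]
        src += 1
        offset += 1
    return memory


memory = write_msg([5, 3, 10, 11, 12], [0] * 8)
```

The header length counts every word of the message, including the header and the destination address, which is why the source offset starts at 2.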







; READ_MSG -- Message routine to read a block of data from consecutive
; locations.

; READ (source-address) (reply-node) (reply-header)

READ_MSG:
        MOVE    [1,A3],R1               ; R1 <- address/len of source
        MOVE    R1,A2                   ; Copy to A2
        SEND    [2,A3]                  ; Send reply node number
        DC      SYS_LEN_MASK            ; R0 <- Mask to keep length
        AND     R1,R0,R1                ; R1 <- length
        BNZ     R1,^_READ_MSG0          ; If length != 0, continue
        SENDE   [3,A3]                  ; If no length, just mail hdr
        SUSPEND
_READ_MSG0:
        SUB     R1,1,R1                 ; Convert length to offset
        MOVE    0,R2                    ; Initialize index
        SEND    [3,A3]                  ; Send reply header
_READ_MSG1:
        EQUAL   R1,R2,R0                ; Is index = final index?
        BT      R0,^_READ_MSG2          ; If so, use SENDE instead
        SEND    [R2,A2]                 ; Send a word of data
        ADD     R2,1,R2                 ; Increment source index
        BR      ^_READ_MSG1             ; Loop again
_READ_MSG2:
        SENDE   [R2,A2]                 ; Send final word
        SUSPEND
READ_MSG_END:





; CALL_MSG -- Message routine to run a method
; CALL (method-id) (method-specific-args)*

CALL_MSG:
        MOVE    [1,A3],R2               ; R2 <- Method-ID
        XLATE   R2,R0,XLATE_METHOD      ; R0 <- Method address
        CHECK   R0,TAG_INT,R1           ; Is this a hint?
        DC      IP:2                    ; IP <- Offset of 2 into method
        PUSH    R0
        POP     IP
CALL_MSG_END:


; SEND_MSG -- Message routine to take an object ID, and send the object
; referenced by the ID the selector "selector-symbol". If the object
; is local, the method is run. If the object is on another node,
; we forward the message to the node.

; SEND (selector-symbol) (object-id) (args)*

SEND_MSG:
        BR      ^SEND_MSG_START         ; Jump to main code
SEND_MSG_FORWARD_TO_HOME:
        LSH     R1,-SYS_ID_ID_BITS,R1   ; Shift birthnode number down
        AND     R1,SYS_ID_NODE_MASK,R0  ; Just keep node number field
SEND_MSG_FORWARD_TO_HINT:
        SEND    R0                      ; Send dest. node number
        SUB     R3,1,R3                 ; R3 <- Index to last in queue
        MOVE    0,R0                    ; R0 <- 0
SEND_MSG_FORWARD_LOOP:
        EQUAL   R0,R3,R2                ; Are we at last item?
        BT      R2,^SEND_MSG_FORWARD_EXIT ; If so, send with SENDE
        SEND    [R0,A3]                 ; Send item from queue
        ADD     R0,1,R0                 ; Increment R0
        BR      ^SEND_MSG_FORWARD_LOOP
SEND_MSG_FORWARD_EXIT:
        SENDE   [R0,A3]
        SUSPEND

SEND_MSG_START:
        MOVE    [0,A3],R0               ; R0 <- Message header
        AND     R0,SYS_LEN_MASK,R3      ; R3 <- Length of message
        MOVE    [2,A3],R1               ; R1 <- Object ID
        XLATE   R1,R0,XLATE_LOCAL       ; R0 <- Bound value of obj ID
        BNIL    R0,^SEND_MSG_FORWARD_TO_HOME ; If rcvr not here, forward msg
        CHECK   R0,TAG_INT,R2           ; Is value a hint?
        BT      R2,^SEND_MSG_FORWARD_TO_HINT ; If so, forward msg to object
        SUB     R3,3,R3                 ; R3 <- Length of args
        MOVE    R0,A2                   ; Copy address to A2
        MOVE    [OBJECT_HDR,A2],R1      ; R1 <- Header of object
        LSH     R1,-SYS_LEN_BITS,R1     ; Shift class down
        AND     R1,SYS_CLASS_MASK,R1    ; R1 <- Class
        DC      SYS_SELECTOR_BITS       ; R0 <- Bits of selector field
        LSH     R1,R0,R1                ; Shift class field up
        OR      R1,[1,A3],R1            ; Merge with selector
        WTAG    R1,TAG_CS,R1            ; Tag as a class/selector
        XLATE   R1,R2,XLATE_METHOD      ; R2 <- Method ID
        DC      MSG:(CALL_MSG<<SYS_LEN_BITS) ; R0 <- Msg header w/o length
        ADD     R3,2,R3                 ; R3 <- Length of CALL message
        OR      R0,R3,R0                ; Merge with message length
        MOVE    R2,R1                   ; Copy Method-ID to R1
        CALL    TRAP_ID_TO_NODE         ; R1 <- Node(Method-ID)
        SEND2   R1,R0                   ; Send node, header
        SUB     R3,2,R3                 ; R3 <- Length of args
        BZ      R3,^SEND_MSG_SEND_LAST  ; If no args, just send meth-ID
        SEND    R2                      ; Send Method-ID
        MOVE    3,R0                    ; R0 <- Offset to args
SEND_MSG_LOOP:
        MOVE    [R0,A3],R2              ; R2 <- Argument from queue
        ADD     R0,1,R0                 ; Increment arg offset
        SUB     R3,1,R3                 ; Decrement length
        BZ      R3,^SEND_MSG_SEND_LAST  ; If last arg, send & end
        SEND    R2                      ; Send argument
        BR      ^SEND_MSG_LOOP          ; Loop
SEND_MSG_SEND_LAST:
        SENDE   R2                      ; Send R2 and end
        SUSPEND

SEND_MSG_END:
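
The method lookup inside SEND_MSG builds a key from the receiver's class and the selector, then translates that key to a method ID. A minimal Python sketch of the key construction follows; the field widths (`LEN_BITS`, `CLASS_MASK`, `SELECTOR_BITS`) and the dictionary standing in for the XLATE table are assumptions, since the SYS_* constants are not listed in this section.

```python
LEN_BITS = 10          # assumed width of the header length field
CLASS_MASK = 0xFFF     # assumed width of the class field
SELECTOR_BITS = 12     # assumed width of the selector field


def class_selector_key(object_header, selector):
    """Build the class/selector key used to look up a method."""
    cls = (object_header >> LEN_BITS) & CLASS_MASK   # shift class down, mask
    return (cls << SELECTOR_BITS) | selector         # shift up, merge selector


def lookup_method(object_header, selector, method_table):
    """Return the method ID bound to this class/selector pair, or None."""
    return method_table.get(class_selector_key(object_header, selector))


header = (2 << LEN_BITS) | 7          # class 2, length 7
key = class_selector_key(header, 5)
method = lookup_method(header, 5, {key: "method-id-42"})
```

Packing both fields into one word lets a single associative lookup resolve the method, which mirrors the single XLATE instruction in the listing.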










; NEW_METHOD -- Message handler to allocate and fill a method for a given
; class/selector pair. This routine calls the InstallMethod handler
; to make the class/selector/ID bindings, but this routine suspends
; after calling InstallMethod, without waiting for it to complete.

; NEW_METHOD (class) (selector) (size-of-code) (code)*

NEW_METHOD_MSG:
        MOVE    [3,A3],R0               ; R0 <- Size of code
        ADD     R0,2,R0                 ; Add in 2 header words
        MOVE    CLASS_METHOD,R1         ; R1 <- "Method" class
        CALL    TRAP_NEW                ; Allocate an object
        XLATE   R0,A2,XLATE_OBJ         ; A2 <- Address of object
        MOVE    4,R1                    ; R1 <- Source offset
        MOVE    2,R2                    ; R2 <- Dest offset
        MOVE    [3,A3],R0               ; R0 <- Size of code
NEW_METHOD_MSG_LOOP:
        BZ      R0,^NEW_METHOD_MSG_INSTALL ; If no size left then install
        MOVE    [R1,A3],R3              ; R3 <- Data word
        MOVE    R3,[R2,A2]              ; Put data word in object
        SUB     R0,1,R0                 ; Decrement size
        ADD     R1,1,R1                 ; Increment source
        ADD     R2,1,R2                 ; Increment destination
        BR      ^NEW_METHOD_MSG_LOOP    ; Loop
NEW_METHOD_MSG_INSTALL:
        MOVE    NNR,R1                  ; R1 <- This node number
        DC      MSG:(CALL_MSG<<SYS_LEN_BITS)|4 ; R0 <- header
        SEND2   R1,R0                   ; Send node,header
        DC      HANDLER_INSTALL_METHOD  ; R0 <- ID of InstallMethod
        SEND    R0                      ; Send InstallMethod ID
        SEND    [1,A3]                  ; Send class
        SEND    [2,A3]                  ; Send selector
        SENDE   [OBJECT_ID,A2]          ; Send method ID & end
        SUSPEND

NEW_METHOD_MSG_END:




; NEW_MSG -- Message routine to create a new instance of a certain class and
; mail back the ID.

; NEW (size-of-object) (class) (reply-id) (reply-selector) (optional-data)*

NEW_MSG:
        MOVE    [1,A3],R0               ; R0 <- length of object
        MOVE    [2,A3],R1               ; R1 <- class
        CALL    TRAP_NEW                ; Make a new object
        XLATE   R0,A2,XLATE_OBJ         ; A2 <- Address of object

        ; *** Copy Optional Data ***

        DC      SYS_LEN_MASK            ; R0 <- low 10 bit mask
        MOVE    [0,A3],R1               ; R1 <- Message header
        WTAG    R1,TAG_INT,R1           ; Cast into an INT
        AND     R0,R1,R0                ; R0 <- length of message
        SUB     R0,5,R0                 ; Ignore first 5 arguments,
                                        ; leaving optional data
                                        ; length in R0
        MOVE    5,R1                    ; R1 <- offset into queue
        MOVE    2,R2                    ; R2 <- offset into object
_NEW_MSG1:
        BZ      R0,^_NEW_MSGEXIT        ; If no data left, exit
        SUB     R0,1,R0                 ; Decrement count
        MOVE    [R1,A3],R3              ; R3 <- data from msg. stream
        MOVE    R3,[R2,A2]              ; Store data in object
        ADD     R1,1,R1                 ; Increment offsets
        ADD     R2,1,R2
        BR      ^_NEW_MSG1              ; Loop
_NEW_MSGEXIT:
        MOVE    [3,A3],R1               ; R1 <- reply id
        DC      INT:-SYS_ID_ID_BITS     ; R0 <- # of bits of ID
        LSH     R1,R0,R0                ; Shift node # down & put in R0
        SEND    R0                      ; Send destination node
        DC      MSG:(SEND_MSG<<SYS_LEN_BITS)|4 ; R0 <- SEND message header
        SEND    R0                      ; Mail out the header
        SEND    [3,A3]                  ; Send the target id
        SEND    [4,A3]                  ; Send the selector
        SENDE   [1,A2]                  ; Send new obj ID as final arg
        SUSPEND
NEW_MSG_END:







; METHOD_REQUEST_MSG -- Look up a method and mail the ENTIRE method
; including headers to the requester in a METHOD_REQUEST_REPLY wrapper.

; METHOD_REQUEST (method-ID) (reply-node)

; Runs under: A0 Absolute mode, Unchecked

METHOD_REQUEST_MSG:
        MOVE    [1,A3],R1               ; R1 <- Method ID
        MOVE    [2,A3],R2               ; R2 <- Requester node #
        XLATE   R1,A2,XLATE_METHOD      ; A2 <- Address of method
        DC      SYS_LEN_MASK            ; R0 <- Mask to keep len field
        AND     R0,A2,R3                ; R3 <- Length of method
        ADD     R3,2,R3                 ; R3 <- Length of method
                                        ;       + 2 words for msg & ID,
                                        ;       yielding message length
        DC      MSG:(METHOD_REQUEST_REPLY_MSG<<SYS_LEN_BITS)|SYS_UNC
        OR      R0,R3,R0                ; R0 <- Message header
        SEND2   R2,R0                   ; Send dest node# & msg header
        SEND    R1                      ; Send method-ID
        SUB     R3,2,R3                 ; R3 <- Method length
        MOVE    0,R0                    ; Current index = 0
_METHOD_REQUEST_LOOP:
        SUB     R3,1,R3                 ; Decrement length
        BZ      R3,^_METHOD_REQUEST_SEND_LAST ; If length = 0, send last word
        SEND    [R0,A2]                 ; Mail out method word
        ADD     R0,1,R0                 ; Increment index
        BR      ^_METHOD_REQUEST_LOOP   ; Loop
_METHOD_REQUEST_SEND_LAST:
        SENDE   [R0,A2]                 ; Send final method word
        SUSPEND

METHOD_REQUEST_MSG_END:






; METHOD_REQUEST_REPLY_MSG -- Store the method in an object and restart the
; wait list.

; METHOD_REQUEST_REPLY (method-ID) (method-data)*

; Runs under: A0 absolute mode, Unchecked

METHOD_REQUEST_REPLY_MSG:
        DC      SYS_LEN_MASK            ; R0 <- Mask to keep length
        AND     R0,[0,A3],R0            ; R0 <- Length of message
        PUSH    R0                      ; Save R0 on stack
        SUB     R0,2,R0                 ; Ignore message header & ID
        MOVE    CLASS_METHOD,R1         ; R1 <- Class of a method
        CALL    TRAP_NEW                ; Make a method object
        XLATE   R0,A2,XLATE_OBJ         ; A2 <- Address of object
        DC      SYS_COPY_MASK           ; R0 <- Copy bit
        OR      R0,[OBJECT_HDR,A2],R0   ; R0 <- Hdr marked as a copy
        MOVE    R0,[OBJECT_HDR,A2]      ; Mark object as a copy
        POP     R0                      ; Restore R0 (length of msg)
        SUB     R0,4,R0                 ; R0 <- Len of method w/o hdrs
        MOVE    4,R2                    ; R2 <- Source index
        MOVE    2,R1                    ; R1 <- Destination index
M_R_R_FILL_OBJ:
        BZ      R0,^M_R_R_COPIED        ; If no more length, exit loop
        MOVE    [R2,A3],R3              ; R3 <- Word from message
        MOVE    R3,[R1,A2]              ; Put word in method object
        ADD     R1,1,R1                 ; Increment source index
        ADD     R2,1,R2                 ; Increment destination index
        SUB     R0,1,R0                 ; Decrement length left
        BR      ^M_R_R_FILL_OBJ         ; Loop

M_R_R_COPIED:
        MOVE    [1,A3],R0               ; R0 <- Original method-ID
        MOVE    A2,R1                   ; R1 <- Method copy address
        ENTER   R0,R1                   ; Enter in XLATE cache
        MOVE    XCALL_BRAT_ENTER_NEW,R3 ; R3 <- BRAT EnterNew Xcall #
        CALL    TRAP_XCALL              ; Enter in BRAT
        DC      VAR_MCACHE_BASE
        MOVE    [R0,A0],R2              ; R2 <- Offset to method cache
        DC      VAR_MCACHE_LENGTH
        MOVE    [R0,A0],R3              ; R3 <- Word size of cache
        MOVE    [1,A3],R1               ; R1 <- Method ID from message
        ADD     R2,R3,R2                ; R2 <- Offset past mcache

; Search the Method Cache directory.

M_R_R_SEARCH_MC_ID:
        SUB     R2,2,R2                 ; Decrement offset
        SUB     R3,2,R3                 ; Decrement length
        EQ      R1,[R2,A0],R0           ; Is this the ID we want?
        BT      R0,^M_R_R_FOUND_MC_ID   ; If so, branch
        BNZ     R3,^M_R_R_SEARCH_MC_ID  ; If length != 0, loop
        BR      ^M_R_R_NOT_IN_MCACHE    ; If not in MC, check oflow list
M_R_R_FOUND_MC_ID:
        MOVE    NIL,R0                  ; R0 <- NIL
        MOVE    R0,[R2,A0]              ; Set ID to NIL
        ADD     R2,1,R2                 ; Point offset to wait list
        MOVE    [R2,A0],R3              ; R3 <- (car wait-list)
        MOVE    R0,[R2,A0]              ; Set wait list to NIL
M_R_R_RESTART_CTXT_FROM_MCACHE:
        BNIL    R3,^M_R_R_EXIT          ; If context ID is nil, exit
        READR   NNR,R2                  ; R2 <- This NNR
        SEND    R2                      ; Send a message to this node
        DC      MSG:(RESTART_CONTEXT_MSG<<SYS_LEN_BITS)|2|SYS_UNC
        SEND    R0                      ; Send message header
        SENDE   R3                      ; Send ID to restart
        XLATE   R3,A2,XLATE_OBJ         ; Get address of context
        MOVE    [CONT_NEXT_CONTEXT,A2],R3 ; R3 <- next ctxt ID in list
        BR      ^M_R_R_RESTART_CTXT_FROM_MCACHE
M_R_R_EXIT:
        SUSPEND

; If not in MCACHE directory, search overflow list. Use R2 to hold
; the previous context ID, and R3 the current context ID. Use these
; pointers to delink items from the overflow list.

M_R_R_NOT_IN_MCACHE:
        MOVE    NIL,R2                  ; No previous ID
        DC      VAR_MCACHE_OVERFLOW_LIST ; R0 <- Addr of oflow list
        MOVE    [R0,A0],R3              ; R3 <- Car of overflow list
M_R_R_LOOP_THRU_OVERFLOW_LIST:
        BNIL    R3,^M_R_R_EXIT          ; When list NIL, exit
        XLATE   R3,A2,XLATE_OBJ         ; A2 <- Context Addr
        EQ      R1,[CONT_RESOURCE,A2],R0 ; Waiting for this method?
        BT      R0,^M_R_R_UNLINK_CTXT   ; If so, cut ctxt out of list
        MOVE    R3,R2                   ; Prev ID <- Current ID
        MOVE    [CONT_NEXT_CONTEXT,A2],R3 ; R3 <- next ctxt ID in list
        BR      ^M_R_R_LOOP_THRU_OVERFLOW_LIST
M_R_R_UNLINK_CTXT:
        BNNIL   R2,^M_R_R_UNLINK_MIDDLE_CONTEXT ; If prev != nil, link to next
M_R_R_UNLINK_FIRST_CONTEXT:
        MOVE    [CONT_NEXT_CONTEXT,A2],R3 ; R3 <- Next context
        DC      VAR_MCACHE_OVERFLOW_LIST ; R0 <- Addr of oflow list
        MOVE    R3,[R0,A0]              ; Overflow list <- Next ctxt
        MOVE    R2,[CONT_NEXT_CONTEXT,A2] ; Next context ptr <- NIL
        MOVE    [OBJECT_ID,A2],R0       ; R0 <- Ctxt ID
        BR      ^M_R_R_RESTART_CTXT_FROM_LIST ; Queue up context for execution
M_R_R_LILYPAD:
        BR      ^M_R_R_LOOP_THRU_OVERFLOW_LIST
M_R_R_UNLINK_MIDDLE_CONTEXT:
        MOVE    [CONT_NEXT_CONTEXT,A2],R3 ; R3 <- Next context
        MOVE    NIL,R0                  ; R0 <- NIL
        MOVE    R0,[CONT_NEXT_CONTEXT,A2] ; Next context <- NIL
        MOVE    [OBJECT_ID,A2],R0       ; R0 <- ID to clipped-out ctxt
        XLATE   R2,A2,XLATE_OBJ         ; A2 <- Prev context addr
        MOVE    R3,[CONT_NEXT_CONTEXT,A2] ; Prev o--> Next (skipping curr)
M_R_R_RESTART_CTXT_FROM_LIST:
        PUSH    R0                      ; Save context ID
        READR   NNR,R0                  ; R0 <- This NNR
        SEND    R0                      ; Send a message to this node
        DC      MSG:(RESTART_CONTEXT_MSG<<SYS_LEN_BITS)|2|SYS_UNC
        SEND    R0                      ; Send message header
        POP     R0                      ; Restore context ID
        SENDE   R0                      ; Send ID to restart
        BR      ^M_R_R_LILYPAD          ; Go to next element in list

METHOD_REQUEST_REPLY_MSG_END:
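
The overflow-list handling above is a standard unlink-while-walking pass over a singly linked list. The following Python sketch shows the idea under simplifying assumptions: contexts are dictionaries keyed by ID, `None` plays the role of NIL, and restarting a context is reduced to collecting its ID. The routine name `restart_waiters` is hypothetical.

```python
def restart_waiters(contexts, head, method_id):
    """Unlink every context waiting on `method_id` from the list at `head`.

    Returns (new_head, restarted_ids). `contexts` maps a context ID to a
    dict with "resource" (the method it waits for) and "next" (next ID).
    """
    restarted = []
    prev = None                      # no previous context yet
    cur = head
    while cur is not None:
        ctx = contexts[cur]
        nxt = ctx["next"]
        if ctx["resource"] == method_id:
            if prev is None:
                head = nxt           # unlink the first context
            else:
                contexts[prev]["next"] = nxt   # unlink a middle context
            ctx["next"] = None       # clear the clipped-out context's link
            restarted.append(cur)    # stands in for sending RESTART_CONTEXT
        else:
            prev = cur               # only advance prev past kept contexts
        cur = nxt
    return head, restarted


contexts = {
    1: {"resource": "m1", "next": 2},
    2: {"resource": "m2", "next": 3},
    3: {"resource": "m1", "next": None},
}
head, restarted = restart_waiters(contexts, 1, "m1")
```

Tracking the previous context is what allows a matching entry to be removed without rescanning the list from the start.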
; MIGRATE_OBJECT_MSG -- Move an object to a new node
; MIGRATE_OBJECT (object-id) (node-number)

; Runs under: A0 Absolute mode

MIGRATE_OBJECT_MSG:
        MOVE    [1,A3],R0               ; R0 <- Object ID
        MOVE    [2,A3],R1               ; R1 <- Dest node number
        MOVE    XCALL_MIGRATE_OBJECT,R3
        CALL    TRAP_XCALL              ; Migrate the object
        SUSPEND

MIGRATE_OBJECT_MSG_END:


; IMMIGRATE_OBJECT_MSG -- Let this object reside on this node
; IMMIGRATE_OBJECT (object-id) (previous-residence) (object-data)*
; Runs under: A0 Absolute mode, unchecked

IMMIGRATE_OBJECT_MSG:
        PUSH    I                       ; Save interrupt status
        MOVE    TRUE,R3                 ; R3 <- True
        MOVE    R3,I                    ; Disable interrupts
        MOVE    [0,A3],R0               ; R0 <- Message header
        AND     R0,SYS_LEN_MASK,R0      ; R0 <- Message length
        PUSH    R0                      ; Save message length
        SUB     R0,3,R0                 ; R0 <- Object length
        MOVE    [3,A3],R1               ; R1 <- Object header
        LSH     R1,-SYS_LEN_BITS,R1     ; Shift class down
        AND     R1,SYS_CLASS_MASK,R1    ; R1 <- Class of object
        CALL    TRAP_MALLOC             ; Mallocate me some memory
        MOVE    [3,A3],R2               ; R2 <- Object header
        DC      SYS_UNMOVABLE_MASK      ; R0 <- Unmovable bit
        OR      R2,R0,R2                ; Set unmovable bit in header
        MOVE    R2,[0,A2]               ; Set header of new object
        MOVE    [1,A3],R0               ; R0 <- Object ID
        MOVE    A2,R1                   ; R1 <- Address of block
        ENTER   R0,R1                   ; Enter ID/ADDR in XLATE table
        MOVE    XCALL_BRAT_ENTER_NEW,R3 ; R3 <- BRAT EnterNew Xcall #
        CALL    TRAP_XCALL              ; Enter in BRAT
        MOVE    R0,[1,A2]               ; Fill 2nd slot with ID
        POP     R0                      ; R0 <- Message length
        SUB     R0,1,R1                 ; R1 <- Offset to last msg word
        SUB     R0,4,R0                 ; R0 <- Offset to end of dest
IMMIGRATE_OBJECT_LOOP:
        EQUAL   R1,4,R2                 ; At first data word?
        BT      R2,^IMMIGRATE_OBJECT_EXIT ; If so, done
        MOVE    [R1,A3],R2              ; R2 <- data word
        MOVE    R2,[R0,A2]              ; Put data word in object
        SUB     R0,1,R0                 ; Decrement R0
        SUB     R1,1,R1                 ; Decrement R1
        BR      ^IMMIGRATE_OBJECT_LOOP  ; Loop
IMMIGRATE_OBJECT_EXIT:
        POP     I                       ; Pop int. disable flag
        DC      MSG:SYS_UNC|(NOW_RESIDING_AT_MSG<<SYS_LEN_BITS)|3
        SEND2   [2,A3],R0               ; Send previous node #, header
        MOVE    NNR,R0                  ; R0 <- This node number
        SEND2E  [1,A3],R0               ; Send obj ID and this node #
        SUSPEND
IMMIGRATE_OBJECT_MSG_END:


; NOW_RESIDING_AT_MSG -- Notify old residence of new residence & tell birthnode
; NOW_RESIDING_AT (object-id) (residence-node)

; Runs under: A0 Absolute mode, unchecked

NOW_RESIDING_AT_MSG:
        MOVE    R0,R0                   ; NOP to prevent EARLY fault
        MOVE    [1,A3],R0               ; R0 <- Object ID
        MOVE    [2,A3],R1               ; R1 <- Residence Node #
        ENTER   R0,R1                   ; Cache R0 -> R1
        MOVE    XCALL_BRAT_ENTER,R3     ; R3 <- BRAT_ENTER Xcall #
        CALL    TRAP_XCALL              ; Bind in BRAT
        MOVE    [1,A3],R1               ; R1 <- Object ID
        LSH     R1,-SYS_ID_ID_BITS,R1   ; Shift birthnode number down
        WTAG    R1,TAG_INT,R1           ; Set tag to INT
        DC      MSG:SYS_UNC|(UPDATE_BIRTHNODE_MSG<<SYS_LEN_BITS)|4
        SEND2   R1,R0                   ; Send header to birthnode
        SEND    [1,A3]                  ; Send object ID
        SEND    [2,A3]                  ; Send new residence node
        MOVE    NNR,R0                  ; R0 <- This node #
        SENDE   R0                      ; Send # as previous residence
        SUSPEND

NOW_RESIDING_AT_MSG_END:


; UPDATE_BIRTHNODE_MSG -- Notify the birthnode of the new residence, and
; mark the object movable

; UPDATE_BIRTHNODE (object-id) (residence-node) (previous-node)

; Runs under: A0 Absolute mode, unchecked

UPDATE_BIRTHNODE_MSG:
        MOVE    NNR,R2                  ; R2 <- This node #
        MOVE    [1,A3],R0               ; R0 <- Object ID
        MOVE    [2,A3],R1               ; R1 <- Residence Node #
        MOVE    [3,A3],R3               ; R3 <- Previous node #
        EQUAL   R3,R2,R2                ; Was guy previously here?
        BT      R2,^UPDATE_BIRTHNODE_MOVABLE ; If so, don't rebind again
        ENTER   R0,R1                   ; Cache R0 -> R1
        MOVE    XCALL_BRAT_ENTER,R3     ; R3 <- BRAT_ENTER Xcall #
        CALL    TRAP_XCALL              ; Bind in BRAT
UPDATE_BIRTHNODE_MOVABLE:
        DC      MSG:SYS_UNC|(OBJECT_MOVABLE_MSG<<SYS_LEN_BITS)|2
        SEND2   R1,R0                   ; Send header to residence
        SENDE   [1,A3]                  ; Send object ID

UPDATE_BIRTHNODE_MSG_END:


; OBJECT_MOVABLE_MSG -- Mark the object movable
; OBJECT_MOVABLE (object-id)

; Runs under: A0 Absolute mode, unchecked

OBJECT_MOVABLE_MSG:
        MOVE    R0,R0                   ; NOP to prevent EARLY fault
        MOVE    [1,A3],R0               ; R0 <- Object ID
        XLATE   R0,A2,XLATE_OBJ         ; A2 <- Object address
        MOVE    [0,A2],R1               ; R1 <- Object header
        DC      -SYS_UNMOVABLE_MASK     ; R0 <- All but unmovable bit
        AND     R1,R0,R1                ; R1 <- Movable object header
        MOVE    R1,[0,A2]               ; Put header back in object
        SUSPEND
OBJECT_MOVABLE_MSG_END:









SYSTEM CALL TRAPS 


; XCALL_TRP -- Call an extended system call

; Runs under: A0 absolute mode, unchecked
; Inputs:     R3
; Trashes:    R3

XCALL_TRP:
        PUSH    R0                      ; Save R0
        DC      OS_XVECTORS_BASE        ; R0 <- Base of xvectors
        ADD     R0,R3,R3                ; R3 <- Xvectors + xcall #
        MOVE    [R3,A0],R3              ; R3 <- Xcall routine IP
        POP     R0                      ; Restore R0
        MOVE    R3,IP                   ; Go to XCALL routine
XCALL_TRP_END:
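
XCALL_TRP is a table dispatch: the xcall number indexes a vector table and control jumps to the entry found there. A small Python sketch of the same pattern follows. The dictionary used as memory and the two lambda handlers are assumptions for illustration; 408 is the value that appears to correspond to OS_XVECTORS_BASE in the constants listing.

```python
XVECTORS_BASE = 408  # value listed for OS_XVECTORS_BASE (as read from the listing)


def xcall(memory, number, *args):
    """Dispatch extended call `number` through the vector table in `memory`."""
    handler = memory[XVECTORS_BASE + number]   # R3 <- xvectors + xcall number
    return handler(*args)                      # jump to the xcall routine


# Hypothetical handlers placed at the slots for xcall numbers 1 and 3.
memory = {
    XVECTORS_BASE + 1: lambda oid, addr: ("enter", oid, addr),
    XVECTORS_BASE + 3: lambda oid: ("purge", oid),
}
result = xcall(memory, 3, 17)
```

Routing all extended calls through one trap keeps the number of hardware trap vectors small, which is the motivation noted in the edit history.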


; SWEEP_TRP -- Sweep all non-marked objects in the heap down
; towards the base.

; Runs under: A0 shadow

SWEEP_TRP:
        BR      ^SWEEP_TRP_START        ; Go to main code
_SWEEP_EXIT:
        DC      VAR_FREETOP             ; R0 <- &FREETOP
        MOVE    R1,[R0,A0]              ; FREETOP <- New destination
        POP     I
        POP     R3
        POP     R2
        POP     R1
        POP     R0
        POP     IP
SWEEP_TRP_START:
        PUSH    R0
        PUSH    R1
        PUSH    R2
        PUSH    R3
        DC      VAR_HEAP_BASE           ; R0 <- Address of HEAP_BASE
        MOVE    [R0,A0],R2              ; R2 <- Initial source
        MOVE    R2,R1                   ; R1 <- Initial destination
_SWEEP_LOOP:
        PUSH    I
        MOVE    TRUE,R0                 ; R0 <- True
        MOVE    R0,I                    ; Prevent interrupts
        DC      VAR_FREETOP             ; R0 <- &FREETOP
        MOVE    [R0,A0],R0              ; R0 <- End of heap
        GE      R2,R0,R0                ; At or past the end of heap?
        BT      R0,^_SWEEP_EXIT         ; If so, then exit
_SWEEP_CONTINUE:
        DC      SYS_MARK_MASK           ; R0 <- Deletion flag mask
        AND     R0,[R2,A0],R0           ; R0 <- Only deletion bit
        BZ      R0,^_SWEEP_COPY         ; If not deleted, copy object
        ADD     R2,1,R2                 ; R2 <- Offset to object ID
        MOVE    [R2,A0],R0              ; R0 <- Object ID
        PURGE   R0                      ; Remove object ID from cache
        MOVE    XCALL_BRAT_PURGE,R3     ; R3 <- BRAT Purge Xcall #
        CALL    TRAP_XCALL              ; Remove object ID from BRAT
        SUB     R2,1,R2                 ; Make R2 be offset to object
        MOVE    [R2,A0],R0              ; R0 <- Header of object
        AND     R0,SYS_LEN_MASK,R0      ; R0 <- Length of object
        ADD     R2,R0,R2                ; Point src to next object
_SWEEP_ITERATE:
        BR      ^_SWEEP_LOOP            ; Go to next iteration
_SWEEP_COPY:
        MOVE    [R2,A0],R0              ; R0 <- Header of object
        AND     R0,SYS_LEN_MASK,R0      ; R0 <- Length of object
        ADD     R2,R0,R2                ; R2 <- End of src
        ADD     R1,R0,R1                ; R1 <- End of dest
        EQUAL   R1,R2,R3                ; Does src = dest?
        BT      R3,^_SWEEP_ITERATE      ; If so, go to next object
_SWEEP_COPY_LOOP:
        BNZ     R0,^_SWEEP_COPY_LOOP2   ; If R0 != 0 continue copying
        LSH     R1,SYS_LEN_BITS,R3      ; R3 <- dest.addr << len bits
        MOVE    [R1,A0],R0              ; R0 <- Header of object
        AND     R0,SYS_LEN_MASK,R0      ; R0 <- Length of object
        OR      R0,R3,R0                ; R0 <- base | len
        OR      R0,SYS_REL_MASK,R0      ; Mark R0 as relocatable
        WTAG    R0,TAG_ADDR,R0          ; Tag as an address
        PUSH    R1                      ; Save R1
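
The sweep is a sliding compaction: live objects are copied down over the space left by deleted ones, and the free pointer ends up just past the last live object. The Python sketch below illustrates the idea only. The mask values, the list used as a heap, and the `purge` callback are assumptions; the real routine also rebinds each moved object's address, which is omitted here.

```python
LEN_MASK = 0x3FF       # assumed width of the header length field
MARK_MASK = 1 << 20    # assumed position of the deletion mark bit


def sweep(heap, free_top, purge):
    """Slide unmarked objects toward index 0 and return the new free top.

    Each object starts with a header word (length in the low bits, plus an
    optional deletion mark) followed by its ID word. `purge` is called with
    the ID of every deleted object so translation entries can be dropped.
    """
    src = dst = 0
    while src < free_top:
        header = heap[src]
        length = header & LEN_MASK
        if header & MARK_MASK:
            purge(heap[src + 1])                 # deleted: drop its ID binding
        else:
            if dst != src:                       # live: copy down if needed
                heap[dst:dst + length] = heap[src:src + length]
            dst += length
        src += length
    return dst


purged = []
heap = [2, "a", MARK_MASK | 3, "b", "x", 2, "c", 0, 0]
new_top = sweep(heap, 7, purged.append)
```

Skipping the copy when source and destination coincide matches the "Does src = dest?" test in the listing and avoids touching objects that are already in place.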

















; NEW_CONTEXT_TRP -- Create a context for a process
;
; This trap creates a context object when given the size of args
; and locals in R0. The context created looks like:
;
;       start + 0:      |    Header    |
;       start + 1:      |  Context-ID  |
;       start + 2:      | PstateOffset |  (Offset from Header to pstate)
;       start + 3:      | Next-Context |
;       start + 4:      |   Resource   |
;       start + 5:      |    Space     | ---
;                       |     ...      |  |-- Length of space in R0
;                       |     ...      | ---
;       pstate + 1:     |     ID0      |  (Method ID)
;       pstate + 2:     |     ID1      |
;       pstate + 3:     |     ID2      |
;       pstate + 4:     |     ID3      |
;       pstate + 5:     |     R0       |
;       pstate + 6:     |     R1       |
;       pstate + 7:     |     R2       |
;       pstate + 8:     |     R3       |
;       pstate + 9:     |     IP       |


; The address of the block is returned in A1 & A2. The accompanying
; ... with the context ID. The HEADER & CONTEXT ID ... are filled in
; by this routine. The ... with NIL. It is up to application code
; to fill in the ID0-3, R0-3, and IP slots since these values may be
; ... the TRAP code. The PSTATE-OFFSET field is ... the offset from
; the Header of the context. This field ... to ease the building of
; a pointer to the pstate portion of context.
;
; If the space needed is <= the normal context size (defined ...),
; then a fast context is allocated off of ...


; Runs under: A0 absolute mode, unchecked
; Inputs:     R0
; Outputs:    A1,ID1,A2,ID2
; Trashes:    R0

NEW_CONTEXT_TRP:
        PUSH    R1                      ; Save R1
        PUSH    R2                      ; Save R2
        PUSH    R0                      ; Save R0
        DC      VAR_CFREE_LIST          ; R0 <- Base of Cfree list
        MOVE    R0,R2                   ; Swap to R2
        POP     R0                      ; Restore R0 with user size
        GT      R0,CONT_NORMAL_SIZE,R1  ; Is size > normal size?
        BT      R1,^NEW_CONTEXT_TRP_ALLOC ; If so, allocate a new context
        MOVE    [R2,A0],R1              ; R1 <- 1st ctxt in free list
        BNIL    R1,^NEW_CONTEXT_TRP_ALLOC ; If no more normal, then alloc
        XLATE   R1,A1,XLATE_OBJ         ; A1 <- Context Addr
        XLATE   R1,A2,XLATE_OBJ         ; A2 <- Context Addr
        MOVE    [CONT_NEXT_CONTEXT,A1],R0 ; R0 <- Next Context
        MOVE    R0,[R2,A0]              ; Point cfree list to next ctxt
        MOVE    NIL,R0                  ; R0 <- NIL
        MOVE    R0,[CONT_NEXT_CONTEXT,A1] ; Erase next ctxt ptr (for gc)
        POP     R2                      ; Restore R2
        POP     R1                      ; Restore R1
        POP     IP                      ; Return

NEW_CONTEXT_TRP_ALLOC:
        ADD     R0,9,R0                 ; R0 <- Offset to pstate
        PUSH    R0                      ; Save pstate offset
        ADD     R0,CONT_PSTATE_SIZE,R0  ; R0 <- Total context obj size
        MOVE    CLASS_CONTEXT,R1        ; R1 <- "context" class value
        CALL    TRAP_NEW                ; Make a new object
        XLATE   R0,A2,XLATE_OBJ         ; A2 <- Address of object
        XLATE   R0,A1,XLATE_OBJ         ; Copy to A1
        POP     R0                      ; Restore pstate offset
        POP     R2                      ; Restore R2
        POP     R1                      ; Restore R1
        MOVE    R0,[CONT_PSTATE_OFFSET,A2] ; Fill PSTATE-OFFSET ctxt field
        MOVE    NIL,R0                  ; R0 <- NIL
        MOVE    R0,[CONT_NEXT_CONTEXT,A2] ; No next context
        POP     IP

NEW_CONTEXT_TRP_END:
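
The two paths through NEW_CONTEXT_TRP can be summarised as "reuse a pre-built context when the request is small enough and one is free, otherwise allocate". A minimal Python sketch, assuming a plain list as the free list and a hypothetical `allocate` callback in place of TRAP_NEW:

```python
CONT_NORMAL_SIZE = 13   # value listed for CONT_NORMAL_SIZE


def new_context(size, free_list, allocate):
    """Return a context for `size` words of args and locals.

    Takes the fast path when the request fits a normal context and the
    free list is not empty; otherwise falls back to `allocate(size)`.
    """
    if size <= CONT_NORMAL_SIZE and free_list:
        return free_list.pop(0)      # fast path: unlink the list head
    return allocate(size)            # slow path: build a fresh object


def allocate(size):
    """Hypothetical slow-path allocator used only for this example."""
    return f"new-{size}"


free = ["ctx-a", "ctx-b"]
small = new_context(4, free, allocate)
large = new_context(20, free, allocate)
```

Keeping a few normal-sized contexts on a free list avoids a full object allocation, with its ID generation and table updates, on the common path.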








NEW_TRP -- Trap to generate a new object

Takes the size of the object in R0 and the class in R1 and allocates a block
of memory for the object and assigns it a unique ID. The ID is
returned in R0. The header is tagged as an object header, and the
class/length field is filled in. The ID slot is filled with the
newly generated ID for this object. In addition, the XLATE cache
& BRAT are updated.

Runs under: A0 Absolute mode, Unchecked
Inputs:     R0, R1
Outputs:    R0
Trashes:    R1

NEW_TRP:
        PUSH    I                               ; Push int. disable flag
        PUSH    A2                              ; Save A2
        PUSH    R3                              ; Save R3
        MOVE    TRUE,R3                         ; R3 <- True
        MOVE    R3,I                            ; Disable interrupts
        CALL    TRAP_MALLOC                     ; Mallocate me some memory
        LSH     R1,SYS_LEN_BITS,R1              ; Shift class past len bits
        OR      R1,R0,R1                        ; Merge class & length
        WTAG    R1,TAG_OBJHEAD,R1               ; Tag class/length as objheader
        MOVE    R1,[0,A2]                       ; Fill 1st slot with class/len
        CALL    TRAP_GENID                      ; Generate an ID into R0
        MOVE    A2,R1                           ; R1 <- Address of block
        ENTER   R0,R1                           ; Enter ID/ADDR in XLATE table
        MOVE    XCALL_BRAT_ENTER_NEW,R3         ; R3 <- BRAT EnterNew Xcall #
        CALL    TRAP_XCALL                      ; Enter in BRAT
        MOVE    R0,[1,A2]                       ; Fill 2nd slot with ID
        POP     R3                              ; Restore R3
        POP     A2                              ; Restore A2
        POP     I                               ; Pop int. disable flag
        POP     IP                              ; Return
NEW_TRP_END:
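
The class/length header word built by the LSH and OR steps of NEW_TRP can be illustrated with a small Python sketch. The field width used here (`LEN_BITS = 10`) is an assumption standing in for SYS_LEN_BITS, whose real value is not given in this listing, and the tag bits written by WTAG are left out.

```python
LEN_BITS = 10                    # hypothetical stand-in for SYS_LEN_BITS
LEN_MASK = (1 << LEN_BITS) - 1   # hypothetical stand-in for SYS_LEN_MASK


def make_header(class_id, length):
    """Shift the class past the length bits and merge the two fields."""
    if not 0 <= length <= LEN_MASK:
        raise ValueError("length does not fit in the length field")
    return (class_id << LEN_BITS) | length


def header_length(header):
    """Recover the length field, as an AND with the length mask would."""
    return header & LEN_MASK


def header_class(header):
    """Recover the class field by shifting the length bits away."""
    return header >> LEN_BITS
```

Packing the two fields into one word is what lets later routines read an object's length with a single mask operation on its header.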






ID_TO_NODE_TRP -- Trap to find the best node number to hope to
find the object on. Enter with the ID of the object in R1
and exit with the node number in R1.

Runs under: A0
Inputs:     R1
Outputs:    R1

ID_TO_NODE_TRP:
        PUSH    R2
        XLATE   R1,                             ; XLATE locally, nil if unbound
        CHECK   R1,                             ; Does tag = ADDR?
        BF      R2,^ID_TO_NODE_EXIT             ; If not, we are done
ID_TO_NODE_LOCAL:
        MOVE    NNR,R1                          ; R1 <- This node number
ID_TO_NODE_EXIT:
        POP     R2                              ; Restore R2
        POP     IP                              ; Return


MALLOC.TRP 


Primitive memory allocator 


If the 
be cal 


Should 


; Runs under: 

AO sli 

: Input: 

RO 

; Output: 

A2 

MALLOC_TRP:
        PUSH    R0
        PUSH    R1
        PUSH    R2
        PUSH    R3
        MOVE    R0,R1                           ; Copy length to R1
        DC      VAR_FREETOP                     ; R0 <- Offset to VAR_FREETOP
        MOVE    [R0,A0],R2                      ; R2 <- VAR_FREETOP
        ADD     R2,R1,R3                        ; R3 <- address + length
        DC      VAR_BRAT_BASE                   ; R0 <- Offset to VAR_BRAT_BASE
        MOVE    [R0,A0],R0                      ; R0 <- Base of BRAT
        GE      R3,R0,R0                        ; Would new block be too big?
        BT      R0,^_MALLOC_BAD                 ; If so, it is an error
        LSH     R2,SYS_LEN_BITS,R0              ; Shift freetop base up
        OR      R0,R1,R0                        ; Merge in the length field
        OR      R0,SYS_REL_MASK,R0              ; Mark address as relocatable
        WTAG    R0,TAG_ADDR,R0                  ; Cast into an ADDR
        MOVE    R0,A2                           ; Copy to A2
        DC      VAR_FREETOP                     ; R0 <- VAR_FREETOP
        MOVE    R3,[R0,A0]                      ; Update new freetop
        POP     R3
        POP     R2
        POP     R1
        POP     R0
        POP     IP
_MALLOC_BAD:
        CALL    TRAP_DIE                        ; Die for now
MALLOC_TRP_END:
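
MALLOC is a bump allocator: it hands out the block starting at the current free top and advances the free top by the requested length, failing if the block would reach the BRAT. The Python sketch below models only that arithmetic. The class and field names are invented for illustration, and treating "new top equal to the limit" as a failure is an assumption, since the comparison opcode is not legible in the scan.

```python
class Heap:
    """Toy bump allocator bounded above by the base of the BRAT."""

    def __init__(self, free_top, brat_base):
        self.free_top = free_top    # models VAR_FREETOP
        self.brat_base = brat_base  # models VAR_BRAT_BASE, the heap's upper limit

    def malloc(self, length):
        base = self.free_top
        new_top = base + length
        if new_top >= self.brat_base:
            # The listing calls TRAP_DIE here; the model raises instead.
            raise MemoryError("heap would run into the BRAT")
        self.free_top = new_top
        # The ROM packs base and length into one ADDR word; a tuple stands in for it.
        return (base, length)
```

Because nothing is ever returned to the free top, reclaiming space is left to the SWEEP call, which compacts the heap.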







FREE_CONTEXT -- Free up the context in ID1

If the size of the context equals the normal fast context size then
put the context back onto the free list after allocating a
new ID for the context ([illegible] context replies). Otherwise,
the context is marked for deletion.

Runs under: A0 Absolute Mode
Input:      ID1
Trashes:

FREE_CONTEXT_TRP:
        PUSH    R0
        PUSH    R1
        MOVE    ID1,R0
        CALL    TRAP_FREE_SPECIFIED_CONTEXT
        POP     R1
        POP     R0
        POP     IP
FREE_CONTEXT_TRP_END:


FREE_SPECIFIED_CONTEXT -- Free up the context specified in R0.

If the size of the context equals the normal fast context size then
put the context back onto the free list after allocating a
new ID for the context ([illegible] context replies). Otherwise,
the context is marked for deletion.

Runs under: A0 Absolute Mode
Input:      R0
Trashes:    R0, R1

FREE_SPECIFIED_CONTEXT_TRP:
        PUSH    A2                              ; Save A2
        XLATE   R0,A2,XLATE_OBJ                 ; A2 <- Addr of context
        MOVE    [OBJECT_HDR,A2],R1              ; R1 <- Header of context
        AND     R1,SYS_LEN_MASK,R1              ; R1 <- Length of context
        SUB     R1,4,R1                         ; Subtract 4 first words
        SUB     R1,CONT_PSTATE_SIZE,R1          ; R1 <- User space size
        EQUAL   R1,CONT_NORMAL_SIZE,R1          ; Is user space = normal size?
        BT      R1,^FREE_CONTEXT_TRP_KEEP_HIM   ; If so, add him to the list
        MOVE    [OBJECT_HDR,A2],R1              ; R1 <- Header of context
        OR      R1,SYS_MARK_MASK,R1             ; Set deletion bit
        MOVE    R1,[OBJECT_HDR,A2]              ; Move hdr back to object
        BR      ^FREE_CONTEXT_TRP_EXIT

FREE_CONTEXT_TRP_KEEP_HIM:
                                                ; *** No longer need to generate new ID ***
        PURGE   R0                              ; Remove ID R0 from cache
        PUSH    I
        PUSH    R3                              ; Save R3
        MOVE    TRUE,R3                         ; R3 <- True
        MOVE    R3,I                            ; Disable interrupts
        MOVE    XCALL_BRAT_PURGE,R3             ; R3 <- Purge Xcall #
        CALL    TRAP_XCALL                      ; Remove ID from BRAT
        CALL    TRAP_GENID                      ; Make a new ID
        MOVE    R0,[OBJECT_ID,A2]               ; Patch new ID into context
        MOVE    A2,R1                           ; R1 <- Context ADDR
        ENTER   R0,R1                           ; Make new cache binding
        MOVE    A2,R1                           ; R1 <- Context Address
        MOVE    XCALL_BRAT_ENTER,R3             ; R3 <- Enter Xcall #
        CALL    TRAP_XCALL                      ; Enter binding in BRAT
        POP     R3                              ; Restore R3
        POP     I                               ; Restore interrupts
        DC      VAR_CFREE_LIST                  ; R0 <- Offset to CFREE list
        MOVE    [R0,A0],R1                      ; R1 <- CFREE base
        MOVE    R1,[CONT_NEXT_CONTEXT,A2]       ; Put CFREE list as next ctxt
        MOVE    [OBJECT_ID,A2],R1               ; R1 <- Object ID
        MOVE    R1,[R0,A0]                      ; CFREE list <- Context ID

FREE_CONTEXT_TRP_EXIT:
        POP     A2                              ; Restore A2
        POP     IP                              ; Return
FREE_SPECIFIED_CONTEXT_TRP_END:
VERSION_TRP -- Return the version number

Returns the version number in R0. The version number is an INT tagged value
where the high 16 bits hold the major version number and the low 16
bits hold the minor version number.

Runs under: A0 Absolute Mode
Output:     R0
Trashes:    Internally: R0
            Totally:    R0

VERSION_TRP:
        DC      ROM_VERSION
        MOVE    [R0,A0],R0
        POP     IP
VERSION_TRP_END:
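
The version word layout described above (major number in the high 16 bits, minor number in the low 16 bits) can be sketched in Python as follows. The function names are invented for illustration and are not part of the ROM.

```python
def pack_version(major, minor):
    """Build a version word: major in the high 16 bits, minor in the low 16."""
    return ((major & 0xFFFF) << 16) | (minor & 0xFFFF)


def unpack_version(word):
    """Split a version word back into (major, minor)."""
    return (word >> 16) & 0xFFFF, word & 0xFFFF
```

With this layout the ROM constant `INT:(1<<16)|0` reads as version 1.0.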


XFERx_TRP -- Transfer execution to a context

The routines XFER_ID_TRP and XFER_ADDR_TRP both transfer control to a context,
named either by ID or by physical pointer. To transfer by ID, enter
XFER_ID_TRP with the context ID in R0; to transfer by address, enter
XFER_ADDR_TRP with the context address in A1.
The context is FREEd afterwards.

Runs under: A0 Absolute Mode

XFER_ID_TRP
  Input:   R0
  Trashes: Locally: R0,A0,A1
           Totally: R0,A0,A1

XFER_ADDR_TRP
  Input:   A1
  Trashes: Locally: R0,A0
           Totally: R0,A0

Never returns.


XFER_ID_TRP:
        XLATE   R0,A1,XLATE_OBJ                 ; Get context addr in A1
XFER_ADDR_TRP:
        PUSH    I
        MOVE    TRUE,R0                         ; R0 <- True
        MOVE    R0,I                            ; Disable interrupts
        MOVE    [OBJECT_ID,A1],R0               ; R0 <- Context ID
        MOVE    R0,ID1                          ; Set ID1 to context ID
        MOVE    R0,[7,A0]                       ; Store in current context ID
        MOVE    A1,R0                           ; R0 <- Pointer to context
        LSH     R0,-SYS_LEN_BITS,R0             ; Shift addr field down
        ADD     R0,[CONT_PSTATE_OFFSET,A1],R0   ; Add in offset to pstate
        LSH     R0,SYS_LEN_BITS,R0              ; Shift addr field up
        ADD     R0,[CONT_PSTATE_OFFSET,A1],R0   ; Add in pstate length - 1
        ADD     R0,1,R0                         ; R0 <- ADDR:<ps_addr><ps_len>
        MOVE    R0,A1                           ; A1 <- Pointer to pstate
XFER_ADDR_CLR_STACK:
        MOVE    0,R0                            ; R0 <- 0
        WRITER  R0,SP                           ; Flush stack preparing
                                                ; for context resume
        MOVE    [PSTATE_IP,A1],R0               ; R0 <- Old IP from context
        PUSH    R0                              ; Push IP on stack

        MOVE    [PSTATE_ID0,A1],R0
        WRITER  R0,ID0
        MOVE    [PSTATE_ID2,A1],R0
        WRITER  R0,ID2
        MOVE    [PSTATE_ID3,A1],R0
        WRITER  R0,ID3

        MOVE    [PSTATE_R0,A1],R0
        MOVE    [PSTATE_R1,A1],R1
        MOVE    [PSTATE_R2,A1],R2
        MOVE    [PSTATE_R3,A1],R3
        PUSH    R0                              ; Save R0
        PUSH    R1                              ; Save R1
        MOVE    [OBJECT_ID,A1],R0               ; R0 <- Context ID
        CALL    TRAP_FREE_CONTEXT               ; Free context
        POP     R1                              ; Restore R1
        POP     R0                              ; Restore R0

        INVAL                                   ; Invalidate address regs

        MOVE    ID0,R0                          ; R0 <- Method-ID from context
        BNIL    R0,^XFER_ADDR_CLR_STACK         ; If ID0 slot nil, don't XLATE
        POP     I
        XLATE   R0,A0,XLATE_METHOD              ; A0 <- Address of method
        POP     IP                              ; Transfer execution to context
XFER_ADDR_TRP_END:
XFER_ID_TRP_END:


BRAT_PEEK_TRP -- Finds the current slot of the ID in the BRAT

Runs under: A0 Absolute Mode, Unchecked
Inputs:     R0, R1, A2
Output:     R0

The ID to hash to give the first offset to start searching from is in
R0, the ID to search for is in R1, and A2 points to the base of the BRAT.
The only time R0 and R1 would be different would be if you were
searching for an empty slot: R1 would then be NIL, while R0 would
still hold the ID to hash.

If the ID is not in the BRAT, NIL is returned in R0.


BRAT_PEEK_TRP:
        PUSH    R2
        PUSH    R3

                                                ; Convert the ID into an initial offset key into the BRAT
        WTAG    R0,TAG_INT,R0                   ; Cast R0 into an INT
        LSH     R0,-8,R2                        ; R2 <- ID >> 8
        XOR     R0,R2,R3                        ; R3 <- ID xor (ID >> 8)
        LSH     R2,-8,R2                        ; R2 <- ID >> 16
        XOR     R0,R2,R3                        ; R3 <- R0 xor (ID >> 16)
        LSH     R2,-8,R2                        ; R2 <- ID >> 24
        XOR     R0,R2,R3                        ; R3 <- R0 xor (ID >> 24)
        LSH     R3,1,R3                         ; R3 <- key * 2 = offset
        DC      VAR_BRAT_HASH_MASK              ; R0 <- Offset to hash mask
        MOVE    [R0,A0],R0                      ; R0 <- mask
        AND     R3,R0,R3                        ; Now R3 holds key into BRAT

                                                ; Find the table length
        DC      SYS_LEN_MASK
        AND     R0,A2,R2                        ; R2 <- BRAT length
                                                ; Search for the ID starting at offset
_BRAT_PEEK_LOOP:
        BZ      R2,^_BRAT_PEEK_FAIL             ; If no more length, fail
        EQ      R1,[R3,A2],R0                   ; Have we found the target?
        BT      R0,^_BRAT_PEEK_GOT_HIM
_BRAT_PEEK_NEXT:
        SUB     R2,2,R2                         ; Decrement length left
        SUB     R3,2,R3                         ; Decrement current offset
        LT      R3,0,R0                         ; Is offset < 0?
        BF      R0,^_BRAT_PEEK_LOOP             ; If not, loop

                                                ; We must wrap around to top of BRAT
        DC      SYS_LEN_MASK
        AND     R0,A2,R3                        ; R3 <- Length of BRAT
        SUB     R3,2,R3                         ; Point to top ID slot in BRAT
        BR      ^_BRAT_PEEK_LOOP

                                                ; If ID not in table, we end up here
_BRAT_PEEK_FAIL:
        MOVE    NIL,R3                          ; R3 <- NIL
_BRAT_PEEK_GOT_HIM:
        MOVE    R3,R0                           ; R0 <- Offset of ID in BRAT
        POP     R3
        POP     R2
        POP     IP
BRAT_PEEK_TRP_END:
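
BRAT_PEEK is a hashed lookup with a downward linear probe that wraps from the bottom of the table to the top and gives up after one full pass. The Python sketch below models that search on a flat list of `[id, addr, id, addr, ...]` slots. It is an illustration under assumptions: the hash folds all four bytes of the ID together, which is what the comments suggest but not necessarily what the register usage in the scan does, and `hash_mask` is assumed to yield an even offset inside the table.

```python
def brat_hash(ident, hash_mask):
    """Fold the ID's bytes together, double the key, and mask it to an even offset."""
    key = ident ^ (ident >> 8) ^ (ident >> 16) ^ (ident >> 24)
    return (key << 1) & hash_mask


def brat_peek(table, hash_id, target, hash_mask):
    """Return the offset of the slot holding `target`, or None.

    `hash_id` picks the starting offset; `target` is what is searched for.
    Passing target=None looks for an empty slot near where hash_id hashes.
    """
    length = len(table)
    offset = brat_hash(hash_id, hash_mask)
    remaining = length
    while remaining > 0:
        if table[offset] == target:
            return offset
        remaining -= 2      # one ID/ADDR pair examined
        offset -= 2         # probe downward
        if offset < 0:
            offset = length - 2   # wrap around to the top ID slot
    return None
```

Separating the ID that is hashed from the value that is matched is what lets the same routine serve both lookups and the search for a free slot.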





EXTENDED 


CALL 


ROUTINES 


issxtsssttssssttsssttatstsxtssxsstssssssssssssssssssssstts 




BRAT_ENTER_XTRP -- Add an ID/ADDR pair to the BRAT

Runs under: A0 Absolute Mode, Unchecked Mode
Inputs:     R0, R1

Takes an ID/ADDR pair in R0 & R1 and enters the pair into the BRAT.

BRAT_ENTER_XTRP:
        PUSH    A2
        PUSH    R3
        PUSH    R2
        PUSH    R1
        PUSH    R0
        MOVE    R0,R2                           ; R2 <- ID
        MOVE    R1,R3                           ; R3 <- ADDR
        DC      VAR_BRAT_BASE                   ; R0 <- Offset to BRAT variable
        MOVE    [R0,A0],R1                      ; R1 <- BRAT_BASE
        DC      SYS_LEN_BITS
        LSH     R1,R0,R1                        ; Shift BRAT_BASE to addr field
        DC      VAR_BRAT_LENGTH
        OR      R1,[R0,A0],R1                   ; R1 <- BRAT base | length
        WTAG    R1,TAG_ADDR,R1                  ; Cast R1 into an ADDR
        MOVE    R1,A2                           ; Move BRAT ptr into A2
        MOVE    R2,R0                           ; R0 <- ID that was passed in
        MOVE    R0,R1                           ; R1 <- ID that was passed in
        CALL    TRAP_BRAT_PEEK                  ; Find offset & return in R0
        BNNIL   R0,^_BRAT_ENTER_OK              ; If offset != nil, we got ID
        MOVE    R1,R0                           ; R0 <- ID (still in R1)
        MOVE    NIL,R1                          ; R1 <- NIL
        CALL    TRAP_BRAT_PEEK                  ; Find offset & return in R0
        BNNIL   R0,^_BRAT_ENTER_OK              ; If offset non nil, still room
        CALL    TRAP_DIE                        ; If no room, die for now.
_BRAT_ENTER_OK:
        MOVE    R2,[R0,A2]                      ; Put ID in 1st slot
        ADD     R0,1,R0
        MOVE    R3,[R0,A2]                      ; Put ADDR in 2nd slot
        POP     R0
        POP     R1
        POP     R2
        POP     R3
        POP     A2
        POP     IP
BRAT_ENTER_XTRP_END:








BRAT_ENTER_NEW_XTRP -- Add a new ID/ADDR pair to the BRAT

Runs under: A0 Absolute Mode, Unchecked Mode
Inputs:     R0, R1

Takes an ID/ADDR pair in R0 & R1 and enters the pair into the BRAT. The
caller must be sure that the ID is not already in the BRAT, because
no search is made for pre-existence. This routine is intended to
be a faster way to enter initial bindings, as in a NEW call.

BRAT_ENTER_NEW_XTRP:
        PUSH    A2
        PUSH    R3
        PUSH    R1
        PUSH    R0
        PUSH    R0                              ; Save R0
        MOVE    R1,R3                           ; R3 <- ADDR
        DC      VAR_BRAT_BASE                   ; R0 <- Offset to BRAT variable
        MOVE    [R0,A0],R1                      ; R1 <- BRAT_BASE
        DC      SYS_LEN_BITS
        LSH     R1,R0,R1                        ; Shift BRAT_BASE to addr field
        DC      VAR_BRAT_LENGTH
        OR      R1,[R0,A0],R1                   ; R1 <- BRAT base | length
        WTAG    R1,TAG_ADDR,R1                  ; Cast R1 into an ADDR
        MOVE    R1,A2                           ; Move BRAT ptr into A2
        POP     R0                              ; R0 <- ID that was passed in
        MOVE    NIL,R1                          ; R1 <- NIL (find empty slot)
        CALL    TRAP_BRAT_PEEK                  ; Find offset & return in R0
        BNNIL   R0,^_BRAT_ENTER_NEW_OK          ; If offset non nil, still room
        CALL    TRAP_DIE                        ; If no room, die for now.
_BRAT_ENTER_NEW_OK:
        POP     R1                              ; R1 <- ID
        PUSH    R1                              ; Push ID back on stack
        MOVE    R1,[R0,A2]                      ; Put ID in 1st slot
        ADD     R0,1,R0
        MOVE    R3,[R0,A2]                      ; Put ADDR in 2nd slot
        POP     R0
        POP     R1
        POP     R3
        POP     A2
        POP     IP
BRAT_ENTER_NEW_XTRP_END:



BRAT_XLATE_XTRP -- Xlate an ID from the BRAT into an ADDR

Runs under: A0 Shadow, Unchecked Mode
Inputs:     R0
Output:     R0

Takes the ID to look up in the BRAT in R0. When the corresponding
ADDR value is found, it is returned in R0.

BRAT_XLATE_XTRP:
        PUSH    A2
        PUSH    R2
        PUSH    R1
        MOVE    R0,R2                           ; R2 <- ID
        DC      VAR_BRAT_BASE                   ; R0 <- Offset to BRAT variable
        MOVE    [R0,A0],R1                      ; R1 <- BRAT_BASE
        DC      SYS_LEN_BITS
        LSH     R1,R0,R1                        ; Shift BRAT_BASE to addr field
        DC      VAR_BRAT_LENGTH
        OR      R1,[R0,A0],R1                   ; R1 <- BRAT base | length
        WTAG    R1,TAG_ADDR,R1                  ; Cast R1 into an ADDR
        MOVE    R1,A2                           ; Move BRAT ptr into A2
        MOVE    R2,R0
        MOVE    R2,R1
        CALL    TRAP_BRAT_PEEK                  ; Find offset & return in R0
        BNIL    R0,^_BRAT_XLATE_RETURN          ; If R0 nil, return the nil
        ADD     R0,1,R0
        MOVE    [R0,A2],R0                      ; Pick out ADDR & return in R0
_BRAT_XLATE_RETURN:
        POP     R1
        POP     R2
        POP     A2
        POP     IP
BRAT_XLATE_XTRP_END:




BRAT_PURGE_XTRP -- Purge an ID/ADDR pair from the BRAT

Runs under: A0 Shadow, Unchecked Mode
Inputs:     R0

Enter with ID to purge in R0. The routine writes NIL into both
the ID & ADDR slot of the binding in the table.

BRAT_PURGE_XTRP:
        PUSH    A2
        PUSH    R2
        PUSH    R1
        PUSH    R0
        MOVE    R0,R2                           ; R2 <- ID
        DC      VAR_BRAT_BASE                   ; R0 <- Offset to BRAT variable
        MOVE    [R0,A0],R1                      ; R1 <- BRAT_BASE
        DC      SYS_LEN_BITS
        LSH     R1,R0,R1                        ; Shift BRAT_BASE to addr field
        DC      VAR_BRAT_LENGTH
        OR      R1,[R0,A0],R1                   ; R1 <- BRAT base | length
        WTAG    R1,TAG_ADDR,R1                  ; Cast R1 into an ADDR
        MOVE    R1,A2                           ; Move BRAT ptr into A2
        MOVE    R2,R0
        MOVE    R2,R1
        CALL    TRAP_BRAT_PEEK                  ; Find offset & return in R0
        BNIL    R0,^_BRAT_PURGE_RETURN          ; If ID not in table, return
        MOVE    R0,R1
        DC      SYM:0
        MOVE    R0,[R1,A2]
        ADD     R1,1,R1
        MOVE    R0,[R1,A2]
_BRAT_PURGE_RETURN:
        POP     R0
        POP     R1
        POP     R2
        POP     A2
        POP     IP
BRAT_PURGE_XTRP_END:
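
Taken together, the BRAT enter, translate and purge routines are thin wrappers around the peek search: enter first looks for the ID and, failing that, for an empty slot; translate returns the second word of the pair; purge overwrites both words with NIL. The Python sketch below models that division of labour. The class name, the trivial placeholder hash and the use of `None` for NIL are all assumptions made for illustration.

```python
class Brat:
    """Toy model of the ID -> ADDR table as a flat [id, addr, id, addr, ...] list."""

    def __init__(self, pairs):
        self.table = [None] * (2 * pairs)

    def _peek(self, hash_id, target):
        n = len(self.table)
        offset = (hash_id * 2) % n          # placeholder hash (assumption)
        for _ in range(n // 2):             # at most one full pass over the table
            if self.table[offset] == target:
                return offset
            offset = offset - 2 if offset >= 2 else n - 2
        return None

    def enter(self, ident, addr):
        off = self._peek(ident, ident)      # rebinding an existing ID?
        if off is None:
            off = self._peek(ident, None)   # otherwise find an empty slot
        if off is None:
            raise MemoryError("BRAT full")  # the ROM calls TRAP_DIE here
        self.table[off] = ident
        self.table[off + 1] = addr

    def xlate(self, ident):
        off = self._peek(ident, ident)
        return None if off is None else self.table[off + 1]

    def purge(self, ident):
        off = self._peek(ident, ident)
        if off is not None:
            self.table[off] = None
            self.table[off + 1] = None
```

Because the search always covers the whole table rather than stopping at the first empty slot, a purge can leave a hole without hiding entries that were placed past it.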



MIGRATE_OBJECT_XTRP -- Takes an object ID and sends object to a node

The ID of the object to migrate is in R0, and the destination node
number is in R1. If the object is not local, a MIGRATE_OBJECT_MSG
message is sent to the residence of the object.

Runs under: A0 absolute mode, unchecked
Inputs:     R0, R1
Trashes:    R2, R3

MIGRATE_OBJECT_XTRP:
        PUSH    I                               ; Save old I-Disable flag
        MOVE    TRUE,R2                         ; R2 <- True
        MOVE    R2,I                            ; Disable interrupts
        XLATE   R0,R2,XLATE_ID_TO_NODE          ; R2 <- Address of ID in R0
        PUSH    R0                              ; Save ID
        CHECK   R2,TAG_ADDR,R3                  ; Is object local?
        BT      R3,^MIGRATE_OBJECT_LOCAL        ; If so, migrate it
MIGRATE_OBJECT_FORWARD_MESSAGE:
        SEND    R2                              ; Send residence node #
        DC      MSG:(MIGRATE_OBJECT_MSG<<SYS_LEN_BITS)|3
        SEND    R0                              ; Send message header
        POP     R0                              ; Restore object ID
        SEND2E  R0,R1                           ; Send object ID & node #
        POP     I                               ; Restore interrupts
        POP     IP                              ; Return
MIGRATE_OBJECT_LOCAL:
        PURGE   R0                              ; Remove binding from cache
        MOVE    XCALL_BRAT_PURGE,R3             ; R3 <- Purge Xcall #
        CALL    TRAP_XCALL                      ; Purge R0 from BRAT
        AND     R2,SYS_LEN_MASK,R3              ; R3 <- Length of object
        DC      MSG:SYS_UNC|(IMMIGRATE_OBJECT_MSG<<SYS_LEN_BITS)
        ADD     R0,R3,R0                        ; Add length of object
        ADD     R0,3,R0                         ; Add 3 for hdr, ID, this node
        SEND2   R1,R0                           ; Send node #, header
        POP     R0                              ; R0 <- ID
        SEND    R0                              ; Send ID
        MOVE    NNR,R0                          ; R0 <- This node #
        SEND    R0                              ; Send this node number
        MOVE    0,R0                            ; Current index = 0
MIGRATE_OBJECT_LOOP:
        MOVE    R2,A2                           ; Copy object address to A2
        SUB     R3,1,R3                         ; Decrement length
        BZ      R3,^MIGRATE_OBJECT_LAST         ; If length = 0, send last word
        SEND    [R0,A2]                         ; Mail out object word
        ADD     R0,1,R0                         ; Increment index
        BR      ^MIGRATE_OBJECT_LOOP            ; Loop
MIGRATE_OBJECT_LAST:
        SENDE   [R0,A2]                         ; Send final object word
        DC      TAG_OBJHEAD:SYS_MARK_MASK       ; R0 <- Deletion mark mask
        OR      R0,[0,A2],R0                    ; Mark header deleted
        MOVE    R0,[0,A2]                       ; Store back into header
        POP     I                               ; Restore interrupts
        POP     IP                              ; Return
MIGRATE_OBJECT_XTRP_END:
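
The message that carries a local object to its new node has a simple shape: the destination node, a header whose length field counts the object words plus three (header, ID, sending node), the object ID, the sending node number, and then the object's words. The Python sketch below assembles such a word sequence. The field width `LEN_BITS = 10`, the `opcode` parameter and the function name are assumptions for illustration; tags and the SYS_UNC flag are left out.

```python
LEN_BITS = 10                    # hypothetical stand-in for SYS_LEN_BITS
LEN_MASK = (1 << LEN_BITS) - 1


def build_immigrate_message(dest_node, this_node, obj_id, obj_words, opcode):
    """Return the words sent when migrating a local object, in send order."""
    length = len(obj_words) + 3              # header + ID + sending node + object
    if length > LEN_MASK:
        raise ValueError("object too long for the header's length field")
    header = (opcode << LEN_BITS) + length   # length is added into the header constant
    return [dest_node, header, obj_id, this_node] + list(obj_words)
```

The receiving handler can then allocate space from the header's length field alone, before the object words arrive.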


EXCEPTION HANDLERS 


INVADR_EXC -- Exception handler for access of an Ax register with I bit set
Runs under: A0 absolute mode, unchecked

INVADR_EXC:
        PUSH    R0
        PUSH    R1
        PUSH    R2
        PUSH    R3
        MOVE    TRP,R3                          ; R3 <- Faulting instruction
        DC      SYS_OP0_MASK                    ; R0 <- Mask to keep OP0 field
        AND     R3,R0,R2                        ; R2 <- OP0 field
        DC      -(SYS_OP0_BITS + 2 + 2)         ; R0 <- Bits to shift down
        LSH     R3,R0,R1                        ; R1 <- Opcode
        EQUAL   R1,2,R0                         ; Is opcode 2 (READR)?
        BT      R0,^INVADR_EXC_REG_ORIENTED     ; If so, treat OP0 special
        EQUAL   R1,3,R0                         ; Is opcode 3 (WRITER)?
        BT      R0,^INVADR_EXC_REG_ORIENTED     ; If so, treat OP0 special
INVADR_EXC_NORMAL_OP0:
        MOVE    0,R3                            ; R3 <- 0 (means curr. priority)
        DC      X11                             ; Mask to keep Ax bits
        AND     R2,R0,R2                        ; R2 <- A index
        BR      ^INVADR_EXC_REXLATE             ; Re-translate IDx -> Ax
INVADR_EXC_REG_ORIENTED:
        LSH     R2,-(SYS_OP0_BITS - 1),R3       ; R3 <- Relative priority
        DC      X11                             ; Mask to keep Ax bits
        AND     R2,R0,R2                        ; R2 <- A index
INVADR_EXC_REXLATE:
        LSH     R3,2,R3
        OR      R3,R2,R3                        ; R3 <- (PAA)
INVADR_EXC_DISPATCH_ON_PAA:
        BR      R3                              ; Branch forward R3 words
INVADR_EXC_ID_LOADERS:
        MOVE    ID0,R0                          ; R0 <- ID0
        BR      ^INVADR_EXC_XLATE               ; Branch and XLATE
        MOVE    ID1,R0                          ; R0 <- ID1
        BR      ^INVADR_EXC_XLATE               ; Branch and XLATE
        MOVE    ID2,R0                          ; R0 <- ID2
        BR      ^INVADR_EXC_XLATE               ; Branch and XLATE
        MOVE    ID3,R0                          ; R0 <- ID3
        BR      ^INVADR_EXC_XLATE               ; Branch and XLATE
        MOVE    ID0',R0                         ; R0 <- ID0'
        BR      ^INVADR_EXC_XLATE               ; Branch and XLATE
        MOVE    ID1',R0                         ; R0 <- ID1'
        BR      ^INVADR_EXC_XLATE               ; Branch and XLATE
        MOVE    ID2',R0                         ; R0 <- ID2'
        BR      ^INVADR_EXC_XLATE               ; Branch and XLATE
        MOVE    ID3',R0                         ; R0 <- ID3'
        BR      ^INVADR_EXC_XLATE               ; Branch and XLATE
INVADR_EXC_XLATE:
        XLATE   R0,R1,XLATE_LOCAL               ; R1 <- Addr, Int, or NIL

                                                ; What if object isn't here! If XLATE faults, we don't save stacks!


EARLY_EXC -- Exception handler for early Queue access

; Runs under: A0 shadow
; Trashes:    TEMP0

EARLY_EXC:
        MOVE    R0,[TEMP0,A0]                   ; Save R0 in TEMP0
        POP     R0                              ; R0 <- Return Address
        WTAG    R0,TAG_INT,R0                   ; Cast into an INT
        LSH     R0,-9,R0                        ; Shift R0 to LSBits
        SUB     R0,1,R0                         ; Back up address/phase
        LSH     R0,9,R0                         ; Shift address field back
        WTAG    R0,TAG_IP,R0                    ; Cast back into an IP
        PUSH    R0                              ; Push return IP on stack
        MOVE    [TEMP0,A0],R0                   ; Restore R0
        POP     IP                              ; Retry instruction
EARLY_EXC_END:


SEND_EXC -- Exception handler for send buffer overflow

; Runs under: A0 shadow
; Trashes:    TEMP0

SEND_EXC:
        MOVE    R0,[TEMP0,A0]                   ; Save R0 in TEMP0
        POP     R0                              ; R0 <- Return Address
        WTAG    R0,TAG_INT,R0                   ; Cast into an INT
        LSH     R0,-9,R0                        ; Shift R0 to LSBits
        SUB     R0,1,R0                         ; Back up address/phase
        LSH     R0,9,R0                         ; Shift address field back
        WTAG    R0,TAG_IP,R0                    ; Cast back into an IP
        PUSH    R0                              ; Push return IP on stack
        MOVE    [TEMP0,A0],R0                   ; Restore R0
        POP     IP                              ; Retry instruction
SEND_EXC_END:
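
Both retry handlers back the saved return address up by one step with the same shift, subtract, shift sequence. A minimal Python sketch of that arithmetic follows; the shift amount 9 is taken from the listing, while the function name is invented and the tag casts (WTAG) are left out of the model.

```python
PHASE_SHIFT = 9  # shift used by the handlers to bring the address/phase field to the LSBs


def back_up_ip(return_ip):
    """Drop the low bits, step the address/phase field back by one, and shift it back."""
    return ((return_ip >> PHASE_SHIFT) - 1) << PHASE_SHIFT
```

Note that the low nine bits of the result are always zero, since they are shifted out before the subtraction and never restored.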


XLATE_EXC -- Exception handler for translation fault

Runs under: A0 Absolute Mode, Unchecked
Trashes:    TEMP0-4

XLATE_EXC:
        MOVE    R0,[TEMP0,A0]                   ; Save data registers in
        MOVE    R1,[TEMP1,A0]                   ; TEMP0 - TEMP3 for use
        MOVE    R2,[TEMP2,A0]                   ; as an array
        MOVE    R3,[TEMP3,A0]
        READR   TRP,R0                          ; R0 <- Current priority TRP
        WTAG    R0,TAG_INT,R0
        MOVE    R0,[TEMP4,A0]                   ; TEMP4 <- Current priority TRP
        LSH     R0,-7,R0                        ; Pick out src. register field
        AND     R0,X11,R0
        ADD     R0,TEMP0,R0                     ; Add TEMP0 as start of array
        MOVE    [R0,A0],R0                      ; Load R0 with source ID
        MOVE    R0,R1                           ; Copy ID to R1
        MOVE    XCALL_BRAT_XLATE,R3
        CALL    TRAP_XCALL                      ; See if ID is in BRAT
        BNIL    R0,^XLATE_EXC_NO_BINDING        ; If not, handle no binding
        ENTER   R1,R0                           ; Enter pair in cache
XLATE_RETRY:
        POP     R3                              ; R3 <- Return IP
        LSH     R3,-9,R3                        ; Shift IP until phase is LSB
        SUB     R3,1,R3                         ; Back up one phase
        LSH     R3,9,R3                         ; R3 <- Failed inst. IP
        PUSH    R3                              ; Put retry IP on stack
        MOVE    [TEMP0,A0],R0                   ; Restore data registers
        MOVE    [TEMP1,A0],R1
        MOVE    [TEMP2,A0],R2
        MOVE    [TEMP3,A0],R3
        POP     IP                              ; Retry failed instruction
XLATE_EXC_NO_BINDING:
        MOVE    [TEMP4,A0],R0                   ; R0 <- Failed instruction
        LSH     R0,-(SYS_OP0_BITS+SYS_OP1_BITS),R2
        DC      (1 << SYS_OP2_BITS) - 1         ; R0 <- mask to keep op2 field
        AND     R2,R0,R2                        ; R2 <- XLATE mode from op2
        EQUAL   R2,XLATE_OBJ,R0                 ; Were we in XLATE_OBJ mode?
        BT      R0,^XLATE_EXC_OBJ_MODE          ; If so, branch
        EQUAL   R2,XLATE_ID_TO_NODE,R0          ; Were we in XLATE_ID_TO_NODE?
        BT      R0,^XLATE_EXC_ID_TO_NODE_MODE   ; If so, branch
        EQUAL   R2,XLATE_METHOD,R0              ; Were we in XLATE_METHOD mode?
        BT      R0,^XLATE_EXC_METHOD_MODE_JUMP  ; If so, branch
XLATE_EXC_LOCAL:                                ; *** Dest must be a data register! ***
        MOVE    TRP,R1                          ; R1 <- Failed XLATE
        DC      X1111111                        ; R0 <- Mask to keep Dest field
        AND     R1,R0,R2                        ; R2 <- Dest field of XLATE
        ADD     R2,TEMP0,R2                     ; R2 <- Temp0[Dest]
        MOVE    NIL,R0                          ; R0 <- NIL
        MOVE    R0,[R2,A0]                      ; Temp0[Dest] <- NIL
        MOVE    [TEMP0,A0],R0                   ; Restore data registers
        MOVE    [TEMP1,A0],R1
        MOVE    [TEMP2,A0],R2
        MOVE    [TEMP3,A0],R3
        POP     IP                              ; Return
XLATE_EXC_OBJ_MODE:
        CALL    TRAP_DIE                        ; Just die for now
XLATE_EXC_METHOD_MODE_JUMP:
        BR      ^XLATE_EXC_METHOD_MODE          ; Jump extender
XLATE_EXC_ID_TO_NODE_MODE:
        MOVE    TRP,R1                          ; R1 <- Failed XLATE
        LSH     R1,-7,R1                        ; Shift source bits down
        AND     R1,X11,R1                       ; Just keep source bits
        ADD     R1,TEMP0,R1                     ; R1 <- TEMP0 + Rs
        MOVE    [R1,A0],R1                      ; R1 <- Source ID
        LSH     R1,-SYS_ID_ID_BITS,R1           ; Shift birthnode number down
        AND     R1,SYS_ID_NODE_MASK,R1          ; Just keep node number field
        MOVE    TRP,R2                          ; R2 <- Failed XLATE
        DC      X1111111                        ; R0 <- Mask to keep Dest field
        AND     R2,R0,R2                        ; R2 <- Dest field of XLATE
        ADD     R2,TEMP0,R2                     ; R2 <- TEMP0 + Dest (Rx only!)
        MOVE    R1,[R2,A0]                      ; TEMP[Dest] = birthnode number
        MOVE    [TEMP0,A0],R0                   ; Restore data registers
        MOVE    [TEMP1,A0],R1
        MOVE    [TEMP2,A0],R2
        MOVE    [TEMP3,A0],R3
        POP     IP                              ; Return


XLATE_EXC_METHOD_MODE:
        POP     R3
        LSH     R3,-9,R3                        ; Shift IP until phase is LSB
        SUB     R3,1,R3                         ; Back up one phase
        LSH     R3,9,R3                         ; R3 <- Failed inst. IP

                                                ; Now R1 holds source ID,
                                                ; retry IP is in R3
XLATE_EXC_SAVE_MSG:
        PUSH    R1                              ; Save away R1
        PUSH    ID2                             ; Push ID2 on stack
        MOVE    [0,A3],R2                       ; R2 <- Message header
        DC      SYS_LEN_MASK                    ; R0 <- Mask to keep len bits
        AND     R0,R2,R2                        ; R2 <- Length of msg
        ADD     R2,2,R0                         ; R0 <- Length + 2 words hdr
        MOVE    CLASS_MESSAGE,R1                ; R1 <- Class for copied msg
        CALL    TRAP_NEW                        ; Make an object to hold msg
        XLATE   R0,A2,XLATE_OBJ                 ; A2 <- Address of object
        PUSH    R0                              ; Push msg object ID on stack
        ADD     R2,2,R1                         ; R1 <- Length + 2 words hdr
XLATE_EXC_COPY_MSG:
        BZ      R2,^XLATE_EXC_MAKE_CONTEXT      ; If no length, done copying
        SUB     R2,1,R2                         ; Decrement source index
        SUB     R1,1,R1                         ; Decrement dest index
        MOVE    [R2,A3],R0                      ; R0 <- word from queue
        MOVE    R0,[R1,A2]                      ; Copy into msg object
        BR      ^XLATE_EXC_COPY_MSG             ; Loop
XLATE_EXC_MAKE_CONTEXT:
        MOVE    0,R0                            ; No local space needed
        CALL    TRAP_NEW_CONTEXT                ; A2 <- Context address
        PUSH    I
        MOVE    TRUE,R0                         ; R0 <- True
        MOVE    R0,I                            ; Disable interrupts
        MOVE    A1,R0                           ; R0 <- Pointer to ctxt
        LSH     R0,-SYS_LEN_BITS,R0             ; Shift addr portion down
        ADD     R0,[CONT_PSTATE_OFFSET,A2],R0   ; Add pstate offset to addr
        LSH     R0,SYS_LEN_BITS,R0              ; Shift addr portion back up
        ADD     R0,[CONT_PSTATE_OFFSET,A2],R0   ; Add in length - 1
        ADD     R0,1,R0                         ; R0 <- ADDR:<ps_addr><ps_len>
        MOVE    R0,A2                           ; A2 <- Pointer to pstate

                                                ; A0 -> ????      ID0 -> ????
                                                ; A1 -> Context   ID1 -> Context
                                                ; A2 -> Pstate    ID2 -> ????
                                                ; A3 -> ????      ID3 -> ????


                                                ; Fill IP slot of context
        MOVE    R3,[PSTATE_IP,A2]               ; Context IP <- backed up IP

                                                ; Fill ID slots in context
        POP     R3
        MOVE    R3,[PSTATE_ID3,A2]              ; Point ID3 to msg object
        POP     R3                              ; ID2 is on stack
        MOVE    R3,[PSTATE_ID2,A2]
        READR   ID1,R3
        MOVE    R3,[PSTATE_ID1,A2]
        READR   ID0,R3
        MOVE    R3,[PSTATE_ID0,A2]

                                                ; Fill Rx slots in context
        MOVE    [TEMP0,A0],R3
        MOVE    R3,[PSTATE_R0,A2]
        MOVE    [TEMP1,A0],R3
        MOVE    R3,[PSTATE_R1,A2]
        MOVE    [TEMP2,A0],R3
        MOVE    R3,[PSTATE_R2,A2]
        MOVE    [TEMP3,A0],R3
        MOVE    R3,[PSTATE_R3,A2]

        CHECK   R1,TAG_CS,R3                    ; Does tag = class/selector?
        BF      R3,^XLATE_EXC_REQUEST_METHOD    ; If not, we were xlating an id
XLATE_EXC_LOOKUP_METHOD:
        MOVE    NNR,R3                          ; R3 <- This node number
        [illegible]                             ; R0 <- header
        [illegible]                             ; Send node,header
        [illegible]                             ; R0 <- ID of LookupMethod code
        [illegible]                             ; Send LookupMethod ID,c/s
        SENDE   [OBJECT_ID,A2]                  ; Send context to reply to
        SUSPEND


XLATE_EXC_REQUEST_METHOD:
        DC      VAR_MCACHE_BASE
        MOVE    [R0,A0],R2                      ; R2 <- Base of method cache
        DC      VAR_MCACHE_LENGTH
        MOVE    [R0,A0],R3                      ; R3 <- Length of method cache
        MOVE    NIL,R0
        MOVE    R0,[TEMP4,A0]                   ; TEMP4 <- NIL
        POP     I
        POP     R1                              ; Get R1 back (clean up later)

                                                ; Now R1 holds the method ID, R2 holds the base of
                                                ; the method cache, and R3 holds the length of the
                                                ; method cache
        ADD     R2,R3,R2                        ; R2 <- Offset past mcache
XLATE_EXC_SEARCH_MC_ID:
        SUB     R2,2,R2                         ; Decrement offset
        SUB     R3,2,R3                         ; Decrement length
        EQ      R1,[R2,A0],R0                   ; Is this the id we want?
        BT      R0,^XLATE_EXC_FOUND_MC_ID       ; If so, add context to list
        MOVE    [R2,A0],R0
        BNNIL   R0,^XLATE_EXC_MC_LOOP           ; If entry not nil, loop again
        MOVE    [TEMP4,A0],R0
        BNNIL   R0,^XLATE_EXC_MC_LOOP           ; If TEMP4 is non-nil, loop
        MOVE    R2,[TEMP4,A0]                   ; Entry is nil, so fill
                                                ; TEMP4 with offset to this
                                                ; empty place.
XLATE_EXC_MC_LOOP:
        BNZ     R3,^XLATE_EXC_SEARCH_MC_ID      ; If length != 0, loop
        MOVE    [TEMP4,A0],R0
        BNNIL   R0,^XLATE_EXC_GOT_ROOM          ; If TEMP4 not nil, we found an
                                                ; empty space in the table.
XLATE_EXC_ENTER_IN_OVERFLOW_LIST:
        MOVE    R1,[CONT_RESOURCE,A2]           ; Resource = Method ID
        DC      VAR_MCACHE_OVERFLOW_LIST        ; R0 <- Overflow list addr
        MOVE    R0,R2                           ; Copy to R2
        MOVE    [R0,A0],R0                      ; R0 <- Car of overflow list
        MOVE    R0,[CONT_NEXT_CONTEXT,A2]       ; Next context = rest of list
        MOVE    [OBJECT_ID,A2],R0               ; R0 <- Context-ID
        MOVE    R0,[R2,A0]                      ; Oflow list <- Context-ID
        BR      ^XLATE_EXC_MAIL_ORDER_METHOD    ; Mail for method
XLATE_EXC_GOT_ROOM:
        MOVE    [TEMP4,A0],R2                   ; R2 <- Empty slot offset
        MOVE    R1,[R2,A0]                      ; Fill MC ID with method ID
XLATE_EXC_FOUND_MC_ID:
        ADD     R2,1,R2                         ; Point offset to wait list
        MOVE    [R2,A0],R0                      ; R0 <- (car wait-list)
        MOVE    [OBJECT_ID,A2],R3               ; R3 <- Context-ID
        MOVE    R3,[R2,A0]                      ; Point wait-list to context
        MOVE    R0,[CONT_NEXT_CONTEXT,A2]       ; Point child slot to the
                                                ; rest of wait-list (or nil)

                                                ; [illegible] the wait list for the method.
                                                ; We have to mail off a method request to the hometown
                                                ; node of the method in question (ID in R1).
XLATE_EXC_MAIL_ORDER_METHOD:
        PUSH    R1                              ; Save ID
        CALL    TRAP_ID_TO_NODE                 ; R1 <- Node number of ID
        MOVE    R1,R3                           ; Move to R3
        POP     R1                              ; Restore ID
        DC      MSG:(METHOD_REQUEST_MSG<<SYS_LEN_BITS)|3|SYS_UNC
        SEND2   R3,R0                           ; Send dest node # & message
        READR   NNR,R3                          ; R3 <- This node number
        SEND2E  R1,R3                           ; Send method-ID & this node #
        SUSPEND                                 ; Wait for method reply
XLATE_EXC_END:
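
The method-cache search in the translation-fault handler does two jobs in one pass: it looks for an entry that already names the wanted method, and it remembers the first empty entry it meets in case the method is not there. The Python sketch below models that scan over a flat `[id, wait_list, id, wait_list, ...]` list, walking from the top of the cache downward as the listing does. The function name and the tuple return values are invented for illustration.

```python
def search_method_cache(cache, method_id):
    """Scan the cache top-down in steps of two.

    Returns ("found", offset) if method_id is already present,
    ("empty", offset) for the first empty entry met during the scan,
    or ("overflow", None) when the cache has neither.
    """
    empty = None
    offset = len(cache)
    while offset > 0:
        offset -= 2
        if cache[offset] == method_id:
            return ("found", offset)
        if cache[offset] is None and empty is None:
            empty = offset          # remember only the first empty entry
    if empty is not None:
        return ("empty", empty)
    return ("overflow", None)
```

In the "overflow" case the handler falls back to a separate overflow list of waiting contexts rather than evicting a cache entry.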


EXC. VECTORS: 
DC 
DC 
DC 
DC 
DC 
DC 
X 
X 
DC 
DC 
DC 
DC 
DC 
DC 
DC 
DC 
DC 
DC 
DC 
DC 
X 
X 
X 
DC 
DC 
DC 
DC 
DC 
OC 


IP 

IP 

IP: 

IP 

IP 

IP 

IP 

IP 

IP 

IP 

IP 

IP 

IP 

IP 

IP 

IP 

IP 

IP 

IP 

IP 

IP 

IP 

IP 

IP 

IP 

IP 

IP 

IP 

IP 


•' STS-ABS | (8KGO_EXC«SYS_LEN BITS) 

! *i*- A8S • <EMPTY_FAULT«SYS LEN BITS) 
! ®XS_ABS I ( EK»TY_FAULT«SYS - LEN - BITS) 

: SYS.ABS I (EMPTY_FAULT«SYS - LEN - BITS ) 

1 fif-? 8 * * ^ EM P TY _FAULT<<SYS LENBITS) 
fif-? 88 ' (EARLY_£XC«SYS LEN BITS) 

SYS~an*! / g2™-. FA0 ‘-T«SVS_LEN_BIT5) 

.SYS_ABS| (E*»TY_FAULT«SYS LEN BITS) 

! fif-?S! ! $ ®*^- f *ult«sysIlen - bits ) 
! ?if-?5® * ^ empty - f *ULT«SYS len - bits ) 

: fif - ? 8 *' ^SENO_EXC«SYS_LEN BITS) 

1 fvf“sof! S j*- A8 S1( XLATE.EXCXXSYS LEN 
.SYS.ABS I (E>FTY_FAULT«SYS LEN BITS)’ 
! fif-* 8 * 1 ( F USH_EXC«SYS_LEN BITS ) 

1 SYS.ABSI(POP_EXC< <SYS_LEN BITS) 
fjf-* 8 * * ^ ^^-EAULT< <SYS~LEN BITS) 
SYS.ABS | (E*>TY_FAULT«SYS:LEN - BITS) 

fif-* 8 * * ^ o^y_ f ault«sys:len - bits ) 

SYS.ABS I ( EMPTY_FAULT«SYs3lEN - BITS ) 
SYS.ABS | (EMPTY_FAULT«SYS - LEN - 8ITS) 
SYS.ABS | (EMPTY_FAULT«SYS - LEN - BITS) 
SYS.ABS | (ET»»TY_FAULT«SYS - LEN - BITS) 
SYS.ABS | (EMPTY_FAULT«SYS J.EN~BITS) 
SYS.ABS | (EMPTY_FAULT«SYS LEN~BITS) 
SYS.ABS j ( EMPTY.FAULT<<SYS - LEN - BITS) 
SYS.ABS j ( EMPTY_FAULT<<SYSIlEN - BITS ) 
SYS.ABS j (EMPTY_FAULT<<SYS_LEN - BITS) 
SYS.ABS|(EMPTY.FAULT<<SYS LEN BITS) 
SYS.ABSj(EMPTY_FAULT<<SYSJ.ENJJITS) 


I OBLFAULT 
i ILGINST 
; ILGADRMD 
; ACCESS 

i LIMIT 
i INVADR 
; MSG 
; QUEUE 

.BITS) 
i RANGE 


OVERFLOW 

TYPE 

IA 

IB 

IC 

ID 

IE 

IF 




OC 

IP 

OC 

IP 

DC 

IP 

OC 

IP 

DC 

IP 

OC 

IP 

OC 

IP 

OC 

IP 

OC 

IP 

DC 

IP 

OC 

IP 

IP 

OC 

DC 

IP 

OC 

IP 

DC 

IP 

OC 

IP 

OC 

IP 

DC 

IP 

DC 

IP 

rrORS.END: 

SECTORS: 

DC 

IP 

DC 

IP 

OC 

IP 

DC 

IP 

OC 

IP 

DC 

IP 

OC 

IP 

DC 

IP 

OC 

IP 

DC 

IP 

DC 

IP 

OC 

IP 

OC 

IP 

OC 

IP 

DC 

IP 

OC 

IP 

DC 

IP 

OC 

IP 

OC 

IP 


S_ABS| (EMPTY FAULT«SYS_LEN BITS) 

LABSI(EMPTY FAULK<SYS LEN~8ITS) 

LABS| (EMPTY FAULT«SYS LEN BITS) 
i_UNC|SYS_ABSI(NEW CONTEXT_TRP«SYS LEN BITS) 

>_UNC|SYS_ABS| (FREE.CONTEXT TRP«SYS LEN BITS) 
LABSKXFER 10 TRP«SYS_LEN_BITS) “ “ 

LA8SKXFER AOOR TRP«SYS LEN BITS) 

»_A8S|(ID_TO NOOE TRP«SYS LEN BITS) 

LUNC|SYS_ABS|(NEW TRP«SYS LEN BITS) 
l_UNC|SYS_AB5|(MALL0C_TRP«SYS LEN BITS) 
i_ABS|(GENID_TRP«SYS LEN BITSj ~ 

LABS| (VERSION_TRP«$YS LEN BITS) 

_UNC|SYS.A8S| (BRAT.PEEK TRP«SYS LEN BITS) 

'.UNC)SYS.ABS| (SWEEP TRP«SYS LEN'BITS) 

_UNCISYS.ABSl(FREE_SPECIFIED~CONTEXT TRP«SYS LEN BITS) 
_ABS|(EMPTY_TRAP«SYS LEN BITS) " * ‘ ' 

_ABS | ( empty_trap«sys"len"bits) 

_UNC I SYS.ABS |(XCALL_TRP«SYS LEN BITS) 
E_TRP«SYS_LEN_BITS) ” " 


XCALL_VECTORS_ENO: 


SYS 

SYS 

SYS 

SYS 

SYS 

SYS 

SYS 

SYS 

SYS 

SYS 

SYS 

SYS 

SYS 


.ABS KEMPTYJ<CALL<<SYS LEN BITS) 

i.UNC|SYS.ABS| (BRAT.ENTER XTRP«SYS LEN BITS) 

; .UNC | SYS.ABS | (BRAT_XLATE XTRP«SYS _ LEN _ BITS ) 

.UNC | SYS_ABS | (BRAT.PURGE XTRP«SYS _ LEN _ BITS) 

.UNC|SYS_ABS|(MIGRATE_OBJECT XTRP<<SYS _ LEN BITS) 
.ABS | SYS_ABS I (BRAT ENTER NEW XTRP«SYS - LEN - 8ITS) 
_ABSKEMPTYJ<CALL<<SYS LEN BITS) 

.ABS|(EMPTYJ<CALL<<SYS LEN'BITS) 
ABS|(EMPTY_XCALL«SYS LEN'BITS) 
ABS|(EMPTY_XCALL«SYS LEN'BITS) 

ABS |( EMPTYJ<CALL«SYS _ LEN~BITS) 

ABS | (EMPTY_XCALL«SYS _ LEN — BITS) 

ABS|(EMPTY.XCALL«SYS L£N _ BITS) 

ABS | (EMPTY_XCALL«SYS - LEN _ BITS) 

abs i ( empty_xcall«sys - len - bits ) 

ABS I (EMPTY_XCALL«SYS~LEN _ BITS) 

absi(empty_xcall«sys~len - bits) 

ABSKEMPTYJ<CALL<<SYS LEN~BITS) 

ABS | (EW>TY_XCALL«SYS~LEN - BITS) 


ROM Constants

ROM_VERSION:    DC      INT:(1<<16)|0
ROM_SIZE:       DC      INT:(ROM_END - 1024)
TWIDDLE:        DC      0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
ROM_END:
                END










JOSS Quick Reference 


Primitive Message Handlers

Name             Arguments                               Description

WRITE            <dest-address> <data>*                  Fills the block of memory at
                                                         <dest-address> with the data
                                                         contained in the message. The
                                                         <dest-address> word must be a
                                                         proper ADDR-tagged value.

READ             <src-address> <reply-node> <reply-hdr>  Reads the block of memory
                                                         starting at <src-address> and
                                                         mails the data back to the
                                                         <reply-node> in a message
                                                         whose header is <reply-hdr>.

CALL             <method-id> <args>*                     Starts up the method with ID
                                                         <method-id>. The <args> are
                                                         used by the task being started.

SEND             <selector> <receiver-id> <args>*        Starts up the method that
                                                         performs the operation indicated
                                                         by <selector> on the object with
                                                         ID <receiver-id>. The process
                                                         started uses the <args>.

REPLY            <context-id> <context-slot> <value>     Places a value in the specified
                                                         slot <context-slot> of the context
                                                         with ID <context-id>. If the
                                                         context was waiting for this slot,
                                                         it will be restarted.

NEW_METHOD       <class> <selector> <code>*              Allocates storage for a new
                                                         method, copies the <code> into
                                                         the method object, and installs
                                                         the <class> and <selector> to
                                                         method ID bindings in the
                                                         system table.

NEW              <size> <class> <id> <selector> <data>*  Allocates a new object of type
                                                         <class> on a remote node with
                                                         length <size>, copies the
                                                         optional <data> into the object,
                                                         and when done, sends the
                                                         <selector> to the object with
                                                         ID <id>.

RESTART_CONTEXT  <context-id>                            Queues the context with ID
                                                         <context-id> for execution.

MIGRATE_OBJECT   <object-id> <node-number>               Moves the object with ID
                                                         <object-id> to node number
                                                         <node-number>.
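
The REPLY and RESTART_CONTEXT handlers work together: a reply fills a slot, and if the context was suspended on that slot it is queued to run again. The Python sketch below models only that behavior. The Context class, the waiting_on field, the contexts dictionary and the run_queue are hypothetical stand-ins, not JOSS data structures.

```python
from collections import deque


class Context:
    """Hypothetical stand-in for a JOSS context object."""

    def __init__(self, context_id, num_slots):
        self.context_id = context_id
        self.slots = [None] * num_slots
        self.waiting_on = None  # index of the slot this context is suspended on, if any


contexts = {}        # context-id -> Context (stands in for ID-to-address translation)
run_queue = deque()  # contexts queued for execution


def restart_context(context_id):
    """RESTART_CONTEXT: queue the context with this ID for execution."""
    run_queue.append(contexts[context_id])


def reply(context_id, context_slot, value):
    """REPLY: store the value, and restart the context if it was waiting on this slot."""
    ctx = contexts[context_id]
    ctx.slots[context_slot] = value
    if ctx.waiting_on == context_slot:
        ctx.waiting_on = None
        restart_context(context_id)
```

In this model a reply to a slot the context is not waiting on only stores the value. A reply to the awaited slot also places the context on the run queue.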



System Calls

Name                    Arguments                    Description

XCALL                   Xcall routine number in R3   Calls one of the routines defined in
                                                     the extended call vector table. This
                                                     was implemented since the CALL
                                                     vector table was running out of room.

SWEEP                   (none)                       Compacts the heap.

NEW_CONTEXT             Size of user space in R0     Creates a new context object with R0
                                                     words of user space and returns the
                                                     context address in A1 and A2. R0 is
                                                     trashed.

NEW                     Size of object in R0         Creates a new object of size R0 and
                        Class of object in R1        class R1, and returns the object's ID
                                                     in R0. R1 gets trashed.

ID_TO_NODE              Object ID in R1              Returns in R1 a likely node for the
                                                     object whose ID is in R1.

MALLOC                  Block size in R0             Allocates R0 words of physical
                                                     memory and returns the address in
                                                     A2.

FREE_CONTEXT            Context ID to free in ID1    Frees the context with ID in ID1,
                                                     possibly placing it on the context
                                                     free list.

FREE_SPECIFIED_CONTEXT  Context ID to free in R0     Frees the context with ID in R0,
                                                     possibly placing it on the context
                                                     free list. This trashes R0 and R1.

GENID                   (none)                       Generates a new ID and returns the
                                                     ID in R0.

VERSION                 (none)                       Returns the OS version number in
                                                     R0, where the high 16 bits hold the
                                                     major value, and the low 16 bits the
                                                     minor value.

XFER_ID                 Context ID to restart in R0  Transfers control to the context whose
                                                     ID is in R0. This never returns.

XFER_ADDR               Context address in A1        Transfers control to the context whose
                                                     address is in A1. This never returns.

BRAT_PEEK               ID to hash in R0             Hashes the ID in R0 to find a first
                        ID to search for in R1       slot in the BRAT to search. A linear
                        Base of BRAT table in A2     search proceeds from there until the ID
                                                     in R1 is found. When found, the offset
                                                     from the start of the BRAT where this
                                                     entry is located is returned. If not
                                                     found, NIL is returned.
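
The BRAT_PEEK description amounts to a hashed starting slot followed by a linear probe. The Python sketch below shows that search under several assumptions the table does not state: the hash is a simple modulo of the table size, the probe wraps around the end of the table, the search gives up after one full pass, and empty slots are modeled as None. The name brat_peek and the (id, address) tuple layout are hypothetical.

```python
from typing import List, Optional, Tuple

NIL = None                         # stands in for the NIL result on a miss
Entry = Optional[Tuple[int, int]]  # (id, address), or None for an empty slot


def brat_peek(hash_id: int, search_id: int, brat: List[Entry]) -> Optional[int]:
    """Return the offset of the entry whose ID is search_id, or NIL if absent.

    hash_id picks the first slot to examine (R0 in the table above);
    search_id is the ID being looked for (R1).
    """
    n = len(brat)
    if n == 0:
        return NIL
    start = hash_id % n  # assumed hash function, not the one JOSS uses
    for step in range(n):
        offset = (start + step) % n  # assumed wrap-around probing
        entry = brat[offset]
        if entry is not None and entry[0] == search_id:
            return offset
    return NIL
```

Keeping the hashed ID and the searched-for ID as separate arguments mirrors the two registers in the table, so a caller can start the probe from a slot of its choosing.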



Extended System Calls

Name             Arguments                         Description

BRAT_ENTER       ID to enter in BRAT in R0         Enters the ID/ADDR pair R0/R1 into
                 Address in R1                     the BRAT.

BRAT_XLATE       ID to look up in BRAT in R0       Looks R0 up in the BRAT and returns
                                                   the bound value in R0.

BRAT_PURGE       ID to purge from BRAT in R0       Removes the first binding of R0 from
                                                   the BRAT.

MIGRATE_OBJECT   ID of object to migrate in R0     Migrates the object whose ID is in R0
                 Node to migrate object to in R1   to the node whose number is in R1.
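
Because BRAT_PURGE removes only the first binding of an ID, the same ID can apparently be bound more than once. The Python sketch below models the three BRAT calls with an ordered list of bindings to make that visible. It is not the hashed table JOSS uses. The Brat class is hypothetical, "first" is assumed to mean the earliest binding in search order, and the None result on a failed lookup is an assumption.

```python
class Brat:
    """Toy model of the BRAT as an ordered list of (id, address) bindings."""

    def __init__(self):
        self._bindings = []

    def enter(self, obj_id, addr):
        """BRAT_ENTER: add an ID/ADDR binding."""
        self._bindings.append((obj_id, addr))

    def xlate(self, obj_id):
        """BRAT_XLATE: return the value bound to obj_id (None on a miss, by assumption)."""
        for bound_id, addr in self._bindings:
            if bound_id == obj_id:
                return addr
        return None

    def purge(self, obj_id):
        """BRAT_PURGE: remove only the first binding of obj_id; report whether one was found."""
        for i, (bound_id, _) in enumerate(self._bindings):
            if bound_id == obj_id:
                del self._bindings[i]
                return True
        return False
```

With two bindings entered for the same ID, one purge exposes the second binding to later lookups and a second purge removes it.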